I. INTRODUCTION

The second law of thermodynamics is, without a doubt, one of the most perfect laws in physics. 
Any reproducible violation of it, however small, would bring the discoverer great riches as well as a 
trip to Stockholm. The world's energy problems would be solved at one stroke. It is not possible
to find any other law (except, perhaps, for super selection rules such as charge conservation) for
which a proposed violation would bring more skepticism than this one. Not even Maxwell's laws 
of electricity or Newton's law of gravitation are so sacrosanct, for each has measurable corrections 
coming from quantum effects or general relativity. The law has caught the attention of poets 
and philosophers and has been called the greatest scientific achievement of the nineteenth century. 
Engels disliked it, for it supported opposition to dialectical materialism, while Pope Pius XII
regarded it as proving the existence of a higher being (Bazarow, 1964, Sect. 20). 

A. The basic questions 

In this paper we shall attempt to formulate the essential elements of classical thermodynamics 
of equilibrium states and deduce from them the second law as the principle of the increase of entropy. 
'Classical' means that there is no mention of statistical mechanics here and 'equilibrium' means 
that we deal only with states of systems in equilibrium and do not attempt to define quantities such 
as entropy and temperature for systems not in equilibrium. This is not to say that we are concerned
only with 'thermostatics' because, as will be explained more fully later, arbitrarily violent processes 
are allowed to occur in the passage from one equilibrium state to another. 

Most students of physics regard the subject as essentially perfectly understood and finished, 
and concentrate instead on the statistical mechanics from which it ostensibly can be derived. But 
many will admit, if pressed, that thermodynamics is something that they are sure that someone 
else understands and they will confess to some misgiving about the logic of the steps in traditional 
presentations that lead to the formulation of an entropy function. If classical thermodynamics is 
the most perfect physical theory it surely deserves a solid, unambiguous foundation free of little 
pictures involving unreal Carnot cycles and the like. [For examples of 'un-ordinary' Carnot cycles 
see (Truesdell and Bharatha 1977, p. 48).]

There are two aims to our presentation. One is frankly pedagogical, i.e., to formulate the 
foundations of the theory in a clear and unambiguous way. The second is to formulate equilibrium 
thermodynamics as an 'ideal physical theory', which is to say a theory in which there are well 
defined mathematical constructs and well defined rules for translating physical reality into these 
constructs; having done so the mathematics then grinds out whatever answers it can and these are 
then translated back into physical statements. The point here is that while 'physical intuition' is 
a useful guide for formulating the mathematical structure and may even be a source of inspiration 
for constructing mathematical proofs, it should not be necessary to rely on it once the initial 
'translation' into mathematical language has been given. These goals are not new, of course; see 
e.g., (Duistermaat, 1968), (Giles, 1964, Sect. 1.1) and (Serrin, 1986, Sect. 1.1). 

Indeed, it seems to us that many formulations of thermodynamics, including most textbook 
presentations, suffer from mixing the physics with the mathematics. Physics refers to the real 
world of experiments and results of measurement, the latter quantified in the form of numbers. 
Mathematics refers to a logical structure and to rules of calculation; usually these are built around 
numbers, but not always. Thus, mathematics has two functions: one is to provide a transparent 
logical structure with which to view physics and inspire experiment. The other is to be like a 
mill into which the miller pours the grain of experiment and out of which comes the flour of
verifiable predictions. It is astonishing that this paradigm works to perfection in thermodynamics. 
(Another good example is Newtonian mechanics, in which the relevant mathematical structure is 
the calculus.) Our theory of the second law concerns the mathematical structure, primarily. As such 
it starts with some axioms and proceeds with rules of logic to uncover some non-trivial theorems 
about the existence of entropy and some of its properties. We do, however, explain how physics 
leads us to these particular axioms and we explain the physical applicability of the theorems. 

As noted in I.C below, we have a total of 15 axioms, which might seem like a lot. We can assure 
the reader that any other mathematical structure that derives entropy with minimal assumptions 
will have at least that many, and usually more. (We could roll several axioms into one, as others 
often do, by using sub-headings, e.g., our A1-A6 might perfectly well be denoted by A1(i)-(vi).)
The point is that we leave nothing to the imagination or to silent agreement; it is all laid out. 

It must also be emphasized that our desire to clarify the structure of classical equilibrium 
thermodynamics is not merely pedagogical and not merely nit-picking. If the law of entropy increase 
is ever going to be derived from statistical mechanics — a goal that has so far eluded the deepest 
thinkers — then it is important to be absolutely clear about what it is that one wants to derive. 

Many attempts have been made in the last century and a half to formulate the second law 
precisely and to quantify it by means of an entropy function. Three of these formulations are classic 
(Kestin, 1976), (see also Clausius (1850), Thomson (1849)) and they can be paraphrased as follows: 

Clausius: No process is possible, the sole result of which is that heat is transferred from a body 
to a hotter one. 

Kelvin (and Planck): No process is possible, the sole result of which is that a body is cooled 
and work is done. 

Caratheodory: In any neighborhood of any state there are states that cannot be reached from 
it by an adiabatic process. 

The crowning glory of thermodynamics is the quantification of these statements by means of 
a precise, measurable quantity called entropy. There are two kinds of problems, however. One is 
to give a precise meaning to the words above. What is 'heat'? What is 'hot' and 'cold'? What is 
'adiabatic'? What is a 'neighborhood'? Just about the only word that is relatively unambiguous 
is 'work' because it comes from mechanics. 

The second sort of problem involves the rules of logic that lead from these statements to an 
entropy. Is it really necessary to draw pictures, some of which are false, or at least not self-evident?
What are all the hidden assumptions that enter the derivation of entropy? For instance, we all 
know that discontinuities can and do occur at phase transitions, but almost every presentation 
of classical thermodynamics is based on the differential calculus (which presupposes continuous 
derivatives), especially (Caratheodory, 1925) and (Truesdell-Bharatha, 1977, p. xvii).

We note, in passing, that the Clausius, Kelvin-Planck and Caratheodory formulations are all 
assertions about impossible processes. Our formulation will rely, instead, mainly on assertions about 
possible processes and thus is noticeably different. At the end of Section VII, where everything is 
succinctly summarized, the relationship of these approaches is discussed. This discussion is left to
the end because it cannot be done without first presenting our results in some detail. Some
readers might wish to start by glancing at Section VII. 

Of course we are neither the first nor, presumably, the last to present a derivation of the 
second law (in the sense of an entropy principle) that pretends to remove all confusion and, at the 
same time, to achieve an unparalleled precision of logic and structure. Indeed, such attempts have
multiplied in the past three or four decades. These other theories, reviewed in Sect. I.B, appeal to
their creators as much as ours does to us and we must therefore conclude that ultimately a question 
of 'taste' is involved. 

It is not easy to classify other approaches to the problem that concerns us. We shall attempt to 
do so briefly, but first let us state the problem clearly. Physical systems have certain states (which 
always mean equilibrium states in this paper) and, by means of certain actions, called adiabatic 
processes, it is possible to change the state of a system to some other state. (Warning: The word 
'adiabatic' is used in several ways in physics. Sometimes it means 'slow and gentle', which might 
conjure up the idea of a quasi-static process, but this is certainly not our intention. The usage 
we have in the back of our minds is 'without exchange of heat', but we shall avoid defining the 
word 'heat'. The operational meaning of 'adiabatic' will be defined later on, but for now the reader 
should simply accept it as singling out a particular class of processes about which certain physically 
interesting statements are going to be made.) Adiabatic processes do not have to be very gentle, 
and they certainly do not have to be describable by a curve in the space of equilibrium states. One 
is allowed, like the gorilla in a well-known advertisement for luggage, to jump up and down on the 
system and even dismantle it temporarily, provided the system returns to some equilibrium state 
at the end of the day. In thermodynamics, unlike mechanics, not all conceivable transitions are 
adiabatic and it is a nontrivial problem to characterize the allowed transitions. We shall characterize 
them as transitions that have no net effect on other systems except that energy has been exchanged 
with a mechanical source. The truly remarkable fact, which has many consequences, is that for 
every system there is a function, S, on the space of its (equilibrium) states, with the property that 
one can go adiabatically from a state X to a state Y if and only if $S(X) \le S(Y)$. This, in essence,
is the 'entropy principle' (EP) (see subsection II.B).

The S function can clearly be multiplied by an arbitrary constant and still continue to do
its job, and thus it is not at all obvious that the function $S_1$ for system 1 has anything to do
with the function $S_2$ for system 2. The second remarkable fact is that the S functions for all the
thermodynamic systems in the universe can be simultaneously calibrated (i.e., the multiplicative
constants can be determined) in such a way that the entropies are additive, i.e., the S function
for a compound system is obtained merely by adding the S functions of the individual systems,
$S_{1,2} = S_1 + S_2$. ('Compound' does not mean chemical compound; a compound system is just a
collection of several systems.) To appreciate this fact it is necessary to recognize that the systems
comprising a compound system can interact with each other in several ways, and therefore the
possible adiabatic transitions in a compound are far more numerous than those allowed for separate,
isolated systems. Nevertheless, the increase of the function $S_1 + S_2$ continues to describe the
adiabatic processes exactly, allowing neither more nor less than actually occur. The
statement $S_1(X_1) + S_2(X_2) \le S_1(X_1') + S_2(X_2')$ does not require $S_1(X_1) \le S_1(X_1')$.
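
The last remark can be made concrete with a small numerical sketch. It is not part of our axiomatic development; it assumes a toy model (two bodies with constant heat capacities C, energy U = C T, and entropy S(U) = C ln U up to an additive constant) and treats thermal contact as an allowed process for the compound system. The function names and the numbers are illustrative assumptions only.

```python
import math


def entropy(heat_capacity, energy):
    """Toy entropy S(U) = C ln U for a body with U = C*T (additive constant dropped)."""
    return heat_capacity * math.log(energy)


def equilibrate(c1, u1, c2, u2):
    """Energies after thermal contact: the total energy is shared at one common temperature."""
    t_final = (u1 + u2) / (c1 + c2)
    return c1 * t_final, c2 * t_final


# Hypothetical values: body 1 is hot (T = 400), body 2 is cold (T = 100).
c1, c2 = 1.0, 2.0
u1, u2 = 400.0, 200.0
u1_new, u2_new = equilibrate(c1, u1, c2, u2)

s1_before, s2_before = entropy(c1, u1), entropy(c2, u2)
s1_after, s2_after = entropy(c1, u1_new), entropy(c2, u2_new)

total_before = s1_before + s2_before
total_after = s1_after + s2_after

# The sum S_1 + S_2 increases although S_1 alone decreases.
print(f"S_1: {s1_before:.4f} -> {s1_after:.4f}")
print(f"S_1 + S_2: {total_before:.4f} -> {total_after:.4f}")
```

In this sketch the hot body loses entropy while the compound system gains it, which is exactly the situation the inequality for $S_1 + S_2$ permits and the inequality for $S_1$ alone would forbid.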

The main problem, from our point of view, is this: What properties of adiabatic processes 
permit us to construct such a function? To what extent is it unique? And what properties of the 
interactions of different systems in a compound system result in additive entropy functions? 

The existence of an entropy function can be discussed in principle, as in Section II, without 
parametrizing the equilibrium states by quantities such as energy, volume, etc. But it is an
additional fact that when states are parametrized in the conventional ways then the derivatives 
of S exist and contain all the information about the equation of state, e.g., the temperature T is 
defined by $(\partial S(U,V)/\partial U)_V = 1/T$.
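
As a hedged illustration of this definition, the following sketch recovers a temperature numerically from an entropy function. It assumes the textbook entropy of a monatomic ideal gas, S = n k (ln V + (3/2) ln U) up to an additive constant, in units where k = 1; the model, the function names and the finite-difference step are assumptions made for the example and play no role in our derivation.

```python
import math

K_B = 1.0  # assumption: units in which Boltzmann's constant equals 1


def entropy(u, v, n=1.0):
    """Monatomic ideal gas entropy up to an additive constant: S = n k (ln V + 3/2 ln U)."""
    return n * K_B * (math.log(v) + 1.5 * math.log(u))


def temperature(u, v, n=1.0, h=1e-6):
    """T from 1/T = (dS/dU) at fixed V, using a central finite difference in U."""
    ds_du = (entropy(u + h, v, n) - entropy(u - h, v, n)) / (2.0 * h)
    return 1.0 / ds_du


u, v, n = 3.0, 2.0, 1.0
t = temperature(u, v, n)

# For this model U = (3/2) n k T, so t should be close to 2 U / (3 n k).
t_expected = 2.0 * u / (3.0 * n * K_B)
print(t, t_expected)
```

The equation of state U = (3/2) n k T is not put in by hand; it comes out of the derivative of S, which is the sense in which S contains the information about the equation of state.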

In our approach to the second law temperature is never formally invoked until the very end
when the differentiability of S is proved — not even the more primitive relative notions of 'hotness' 
and 'coldness' are used. The priority of entropy is common in statistical mechanics and in some other 
approaches to thermodynamics such as in (Tisza, 1966) and (Callen, 1985), but the elimination of
hotness and coldness is not usual in thermodynamics, as the formulations of Clausius and Kelvin 
show. The laws of thermal equilibrium (Section V), in particular the zeroth law of thermodynamics, 
do play a crucial role for us by relating one system to another (and they are ultimately responsible 
for the fact that entropies can be adjusted to be additive), but thermal equilibrium is only an 
equivalence relation and, in our form, it is not a statement about hotness. It seems to us that 
temperature is far from being an 'obvious' physical quantity. It emerges, finally, as a derivative of 
entropy, and unlike quantities in mechanics or electromagnetism, such as forces and masses, it is
not vectorial, i.e., it cannot be added or multiplied by a scalar. Even pressure, while it cannot be 
'added' in an unambiguous way, can at least be multiplied by a scalar. (Here, we are not speaking 
about changing a temperature scale; we mean that once a scale has been fixed, it does not mean 
very much to multiply a given temperature, e.g., the boiling point of water, by the number 17. 
Whatever meaning one might attach to this is surely not independent of the chosen scale. Indeed, 
is T the right variable or is it 1/T? In relativity theory this question has led to an ongoing debate
about the natural quantity to choose as the fourth component of a four-vector. On the other hand,
it does mean something unambiguous to multiply the pressure in the boiler by 17. Mechanics
dictates the meaning.) 

Another mysterious quantity is 'heat'. No one has ever seen heat, nor will it ever be seen, 
smelled or touched. Clausius wrote about "the kind of motion we call heat", but thermodynamics,
whether practical or theoretical, does not rely for its validity on the notion of molecules jumping
around. There is no way to measure heat flux directly (other than by its effect on the source and 
sink) and, while we do not wish to be considered antediluvian, it remains true that 'caloric' accounts 
for physics at a macroscopic level just as well as 'heat' does. The reader will find no mention of
heat in our derivation of entropy, except as a mnemonic guide. 

To conclude this very brief outline of the main conceptual points, the concept of convexity 
has to be mentioned. It is well known, as Gibbs (Gibbs 1928), Maxwell and others emphasized, 
that thermodynamics without convex functions (e.g., free energy per unit volume as a function of 
density) may lead to unstable systems. (A good discussion of convexity is in (Wightman, 1979).) 
Despite this fact, convexity is almost invisible in most fundamental approaches to the second law. 
In our treatment it is essential for the description of simple systems in Section III, which are the 
building blocks of thermodynamics. 

The concepts and goals we have just enunciated will be discussed in more detail in the following 
sections. The reader who impatiently wants a quick survey of our results can jump to Section VII 
where it can be found in capsule form. We also draw the reader's attention to the article
(Lieb-Yngvason 1998), where a summary of this work appeared. Let us now turn to a brief discussion of
other modes of thought about the questions we have raised. 

B. Other approaches 

The simplest solution to the problem of the foundation of thermodynamics is perhaps that of 
Tisza (1966), as expanded by Callen (1985) (see also (Guggenheim, 1933)), who, following the
tradition of Gibbs (1928), postulate the existence of an additive entropy function from which all 
equilibrium properties of a substance are then to be derived. This approach has the advantage of
bringing one quickly to the applications of thermodynamics, but it leaves unstated such questions 
as: What physical assumptions are needed in order to insure the existence of such a function? By 
no means do we wish to minimize the importance of this approach, for the manifold implications 
of entropy are well known to be non-trivial and highly important theoretically and practically, as 
Gibbs was one of the first to show in detail in his great work (Gibbs, 1928). 

Among the many foundational works on the existence of entropy, the most relevant for our 
considerations and aims here are those that we might, for want of a better word, call 'order theoretical'
because the emphasis is on the derivation of entropy from postulated properties of adiabatic
processes. This line of thought goes back to Caratheodory (1909 and 1925), although there are
some precursors (see Planck, 1926), and was particularly advocated by Born (1921 and 1964). This
basic idea, if not Caratheodory's implementation of it with differential forms, was developed in
various mutations in the works of Landsberg (1956), Buchdahl (1958, 1960, 1962, 1966), Buchdahl
and Greve (1962), Falk and Jung (1959), Bernstein (1960), Giles (1964), Cooper (1967), Boyling
(1968, 1972), Roberts and Luce (1968), Duistermaat (1968), Hornix (1968), Rastall (1970), Zeleznik
(1975) and Borchers (1981). The work of Boyling (1968, 1972), which takes off from the work of
Bernstein (1960), is perhaps the most direct and rigorous expression of the original Caratheodory
idea of using differential forms. See also the discussion in Landsberg (1970). 

Planck (1926) criticized some of Caratheodory's work for not identifying processes that are not 
adiabatic. He suggested basing thermodynamics on the fact that 'rubbing' is an adiabatic process 
that is not reversible, an idea he already had in his 1879 dissertation. From this it follows that while 
one can undo a rubbing operation by some means, one cannot do so adiabatically. We derive this 
principle of Planck from our axioms. It is very convenient because it means that in an adiabatic 
process one can effectively add as much 'heat' (colloquially speaking) as one wishes, but the one 
thing one cannot do is subtract heat, i.e., use a 'refrigerator'. 

Most authors introduce the idea of an 'empirical temperature', and later derive the absolute 
temperature scale. In the same vein they often also introduce an 'empirical entropy' and later 
derive a 'metric', or additive, entropy, e.g., (Falk and Jung, 1959) and (Buchdahl, 1958, et seq., 
1966), (Buchdahl and Greve, 1962), (Cooper, 1967). We avoid all this; one of our results, as stated 
above, is the derivation of absolute temperature directly, without ever mentioning even 'hot' and 
'cold'. 

One of the key concepts that is eventually needed, although it is not obvious at first, is that of 
the comparison principle (or hypothesis), (CH). It concerns classes of thermodynamic states and 
asserts that for any two states X and Y within a class one can either go adiabatically from X to 
Y, which we write as 

    $X \prec Y$,

(pronounced "X precedes Y" or "Y follows X") or else one can go from Y to X, i.e., $Y \prec X$.
Obviously, this is not always possible (we cannot transmute lead into gold, although we can
transmute hydrogen plus oxygen into water), so we would like to be able to break up the universe
of states into equivalence classes, inside each of which the hypothesis holds. It turns out that the
key requirement for an equivalence relation is that if $X \prec Y$ and $Z \prec Y$ then either $X \prec Z$ or
$Z \prec X$. Likewise, if $Y \prec X$ and $Y \prec Z$ then either $X \prec Z$ or $Z \prec X$. We find this first
clearly stated in Landsberg (1956) and it is also found in one form or another in many places, see 
e.g., (Falk and Jung, 1959), (Buchdahl, 1958, 1962), (Giles, 1964). However, all authors, except 
for Duistermaat (1968), seem to take this postulate for granted and do not feel obliged to obtain it 
from something else. One of the central points in our work is to derive the comparison hypothesis.
This is discussed further below. 
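
To make the content of the comparison hypothesis tangible, here is a minimal sketch on finite toy relations. Everything in it is hypothetical: the state labels, the helper names `comparable`, `comparison_holds` and `common_successor_condition`, and the two example relations are chosen only for illustration. The first relation is induced by a single real-valued function, so any two states are comparable; the second is the componentwise order on points of the plane, for which comparison fails.

```python
from itertools import product


def comparable(prec, x, y):
    """True if x precedes y or y precedes x under the relation `prec`."""
    return prec(x, y) or prec(y, x)


def comparison_holds(states, prec):
    """Check that every pair of states in the collection is comparable."""
    return all(comparable(prec, x, y) for x, y in product(states, repeat=2))


def common_successor_condition(states, prec):
    """If X and Z both precede some Y, then X and Z must be comparable."""
    return all(
        comparable(prec, x, z)
        for x, z, y in product(states, repeat=3)
        if prec(x, y) and prec(z, y)
    )


# Toy relation induced by one real-valued function (hypothetical values).
toy_entropy = {"a": 0.0, "b": 1.0, "c": 1.0, "d": 2.5}


def by_entropy(x, y):
    return toy_entropy[x] <= toy_entropy[y]


# Componentwise order on points of the plane: a preorder without comparison.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]


def componentwise(p, q):
    return p[0] <= q[0] and p[1] <= q[1]


holds_for_entropy = comparison_holds(list(toy_entropy), by_entropy)
holds_for_points = comparison_holds(points, componentwise)
condition_for_points = common_successor_condition(points, componentwise)
print(holds_for_entropy, holds_for_points, condition_for_points)
```

In the second example (0, 1) and (1, 0) both precede (1, 1) yet neither precedes the other, so the requirement stated above is violated and no single function can characterize the relation.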

The formulation of the second law of thermodynamics that is closest to ours is that of Giles 
(Giles, 1964). His book is full of deep insights and we recommend it highly to the reader. It 
is a classic that does not appear to be as well known and appreciated as it should. His derivation
of entropy from a few postulates about adiabatic processes is impressive and was the starting 
point for a number of further investigations. The overlap of our work with Giles's is only partial 
(the foundational parts, mainly those in our Section II) and where there is overlap there are also
differences. 

To define the entropy of a state, the starting point in both approaches is to let a process
that by itself would be adiabatically impossible work against another one that is possible, so that
the total process is adiabatically possible. The processes used by us and by Giles are, however, 
different; for instance Giles uses a fixed external calibrating system, whereas we define the entropy 
of a state by letting a system interact with a copy of itself. (According to R. E. Barieau (quoted
in (Hornix, 1967-1968)) Giles was unaware of the fact that predecessors of the idea of an external 
entropy meter can be discerned in (Lewis and Randall, 1923).) To be a bit more precise, Giles 
uses a standard process as a reference and counts how many times a reference process has to be 
repeated to counteract some multiple of the process whose entropy (or rather 'irreversibility') is 
to be determined. In contrast, we construct the entropy function for a single system in terms of 
the amount of substance in a reference state of 'high entropy' that can be converted into the state 
under investigation with the help of a reference state of 'low entropy'. (This is reminiscent of an 
old definition of heat by Laplace and Lavoisier (quoted in (Borchers, 1981)) in terms of the amount 
of ice that a body can melt.) We give a simple formula for the entropy; Giles's definition is less 
direct, in our view. However, when we calibrate the entropy functions of different systems with 
each other, we do find it convenient to use a third system as a 'standard' of comparison. 
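
The construction just described can be sketched numerically. The sketch below reads it as follows, which is an assumption of the example: for reference states X0 ('low entropy') and X1 ('high entropy'), the entropy of X is taken to be the largest fraction lam such that a compound of (1 - lam) parts of X0 and lam parts of X1 can be converted adiabatically into X. The accessibility test is modelled here by a hidden additive function, purely so that the code has something to query; in the theory itself only the relation is given. All names and values are hypothetical.

```python
def make_oracle(hidden_entropy):
    """Build a toy test for whether ((1 - lam) X0, lam X1) can be converted into X.

    Assumption: accessibility is modelled by a hidden additive function, so the
    split compound state carries the value (1 - lam) * S(X0) + lam * S(X1).
    """
    def can_reach(lam, x0, x1, x):
        mixed = (1.0 - lam) * hidden_entropy[x0] + lam * hidden_entropy[x1]
        return mixed <= hidden_entropy[x]

    return can_reach


def entropy_by_bisection(can_reach, x0, x1, x, steps=60):
    """Approximate the largest lam in [0, 1] for which the conversion is possible."""
    lo, hi = 0.0, 1.0
    if can_reach(hi, x0, x1, x):
        return hi
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if can_reach(mid, x0, x1, x):
            lo = mid  # conversion still possible: try a larger fraction of X1
        else:
            hi = mid  # too much of the high-entropy reference state
    return lo


# Hypothetical hidden values for the two reference states and the state of interest.
hidden = {"low": 2.0, "state": 3.5, "high": 6.0}
oracle = make_oracle(hidden)
lam_star = entropy_by_bisection(oracle, "low", "high", "state")

# The recovered value is the hidden function rescaled so that S(low) = 0, S(high) = 1.
rescaled = (hidden["state"] - hidden["low"]) / (hidden["high"] - hidden["low"])
print(lam_star, rescaled)
```

The point of the sketch is only that yes-or-no questions about accessibility suffice to pin down a number for each state, fixed up to the choice of the two reference states, that is, up to an affine change of scale.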

Giles's work and ours use very little of the calculus. Contrary to almost all treatments, and
contrary to the assertion (Truesdell-Bharatha, 1977) that the differential calculus is the appropriate
tool for thermodynamics, we and he agree that entropy and its essential properties can best be
described by maximum principles instead of equations among derivatives. To be sure, real analysis
does eventually come into the discussion, but only at an advanced stage (Sections III and V in our
treatment).

In Giles, too, temperature appears as a totally derived quantity, but Giles's derivation requires 
some assumptions, such as differentiability of the entropy. We prove the required differentiability 
from natural assumptions about the pressure. 

Among the differences, it can be mentioned that the 'cancellation law', which plays a key 
role in our proofs, is taken by Giles to be an axiom, whereas we derive it from the assumption of 
'stability', which is common to both approaches (see Section II for definitions). 

The most important point of contact, however, and at the same time the most significant 
difference, concerns the comparison hypothesis which, as we emphasized above, is a concept that 
plays an essential role, although this may not be apparent at first. This hypothesis serves to divide 
the universe nicely into equivalence classes of mutually accessible states. Giles takes the comparison 
property as an axiom and does not attempt to justify it from physical premises. The main part of 
our work is devoted to just that justification, and to inquire what happens if it is violated. (There 
is also a discussion of this point in (Giles, 1964, Sect. 13.3) in connection with hysteresis.) To get an
idea of what is involved, note that we can easily go adiabatically from cold hydrogen plus oxygen 
to hot water and we can go from ice to hot water, but can we go either from the cold gases to ice or
the reverse — as the comparison hypothesis demands? It would appear that the only real possibility, 
if there is one at all, is to invoke hydrolysis to dissociate the ice, but what if hydrolysis did not 
exist? In other examples the requisite machinery might not be available to save the comparison 
hypothesis. For this reason we prefer to derive it, when needed, from properties of 'simple systems' 
and not to invoke it when considering situations involving variable composition or particle number, 
as in Section VI. 

Another point of difference is the fact that convexity is central to our work. Giles mentions it, 
but it is not central in his work perhaps because he is considering more general systems than we 
do. To a large extent convexity eliminates the need for explicit topological considerations about 
state spaces, which otherwise have to be put in 'by hand'.

Further developments of Giles's approach are in (Cooper, 1967), (Roberts and Luce, 1968)
and (Duistermaat, 1968). Cooper assumes the existence of an empirical temperature and introduces
topological notions which permit certain simplifications. Roberts and Luce have an elegant
formulation of the entropy principle, which is mathematically appealing and is based on axioms
about the order relation, $\prec$, (in particular the comparison principle, which they call conditional
connectedness), but these axioms are not physically obvious, especially axiom 6 and the comparison 
hypothesis. Duistermaat is concerned with general statements about morphisms of order relations, 
thermodynamics being but one application. 

A line of thought that is entirely different from the above starts with Carnot (1824) and 
was amplified in the classics of Clausius and Kelvin (cf. (Kestin, 1976)) and many others. It
has dominated most textbook presentations of thermodynamics to this day. The central idea 
concerns cyclic processes and the efficiency of heat engines; heat and empirical temperature enter 
as primitive concepts. Some of the modern developments along these lines go well beyond the 
study of equilibrium states and cyclic processes and use some sophisticated mathematical ideas. A 
representative list of references is Arens (1963), Coleman and Owen (1974, 1977), Coleman, Owen 
and Serrin (1981), Dafermos (1979), Day (1987, 1988), Feinberg and Lavine (1983), Green and 
Naghdi (1978), Gurtin (1975), Man (1989), Owen (1984), Pitteri (1982), Serrin (1979, 1983, 1986), 
Silhavy (1997), Truesdell and Bharatha (1977), Truesdell (1980, 1984). Undoubtedly this approach
is important for the practical analysis of many physical systems, but we neither analyze nor take
a position on the validity of the claims made by its proponents. Some of these are, quite frankly,
highly polemical and are of two kinds: claims of mathematical rigor and physical exactness on the
one hand and assertions that these qualities are lacking in other approaches on the other. See, for
example, Truesdell's contribution in (Serrin, 1986, Chapter 5). The chief reason we omit discussion of this
approach is that it does not directly address the questions we have set for ourselves. Namely,
using only the existence of equilibrium states and the existence of certain processes that take one 
into another, when can it be said that the list of allowed processes is characterized exactly by the 
increase of an entropy function? 

Finally, we mention an interesting recent paper by Macdonald (1995) that falls in neither of 
the two categories described above. In this paper 'heat' and 'reversible processes' are among the 
primitive concepts and the existence of reversible processes linking any two states of a system 
is taken as a postulate. Macdonald gives a simple definition of entropy of a state in terms of 
the maximal amount of heat, extracted from an infinite reservoir, that the system absorbs in 
processes terminating in the given state. The reservoir thus plays the role of an entropy meter. 
The further development of the theory along these lines, however, relies on unstated assumptions
about differentiability of the so-defined entropy that are not entirely obvious.

C. Outline of the paper 

In Section II we formally introduce the relation $\prec$ and explain it more fully, but it is to be
emphasized, in connection with what was said above about an ideal physical theory, that $\prec$ has a
well defined mathematical meaning independent of the physical context in which it may be used. 
The concept of an entropy function, which characterizes this accessibility relation, is introduced 
next; at the end of the section it will be shown to be unique up to a trivial affine transformation 
of scale. We show that the existence of such a function is equivalent to certain simple properties 
of the relation which we call axioms A1 to A6 and the 'hypothesis' CH. Any formulation of
thermodynamics must implicitly contain these axioms, since they are equivalent to the entropy 
principle, and it is not surprising that they can be found in Giles, for example. We do believe that 
our presentation has the virtue of directness and clarity, however. We give a simple formula for the 
entropy, entirely in terms of the relation $\prec$ without invoking Carnot cycles or any other gedanken
experiment. Axioms A1 to A6 are highly plausible; it is CH (the comparison hypothesis) that is
not obvious but is crucial for the existence of entropy. We call it a hypothesis rather than an axiom 
because our ultimate goal is to derive it from some additional axioms. In a certain sense it can 
be said that the rest of the paper is devoted to deriving the comparison hypothesis from plausible 
assumptions. The content of Section II, i.e., the derivation of an entropy function, stands on its 
own feet; the implementation of it via CH is an independent question and we feel it is pedagogically 
significant to isolate the main input in the derivation from the derivation itself. 

Section III introduces one of our most novel contributions. We prove that comparison holds 
for the states inside certain systems which we call simple systems. To obtain it we need a few new 
axioms, S1 to S3. These axioms are mainly about mechanical processes, and not about the entropy.
In short, our most important assumptions concern the continuity of the generalized pressure and the 
existence of irreversible processes. Given the other axioms, the latter is equivalent to Caratheodory's 
principle. 

The comparison hypothesis, CH, does not concern simple systems alone, but also their products,
i.e., compound systems composed of possibly interacting simple systems. In order to compare
states in different simple systems (and, in particular, to calibrate the various entropies so that they 
can be added together) the notion of a thermal join is introduced in Section IV. This concerns 
states that are usually said to be in thermal equilibrium, but we emphasize that temperature is 
not mentioned. The thermal join is, by assumption, a simple system and, using the zeroth law 
and three other axioms about the thermal join, we reduce the comparison hypothesis among states 
of compound systems to the previously derived result for simple systems. This derivation is another
novel contribution. With the aid of the thermal join we can prove that the multiplicative
constants of the entropies of all systems can be chosen so that entropy is additive, i.e., the sum 
of the entropies of simple systems gives a correct entropy function for compound systems. This 
entropy correctly describes all adiabatic processes in which there is no change of the constituents 
of compound systems. What remains elusive are the additive constants, discussed in Section VI. 
These are important when changes (due to mixing and chemical reactions) occur. 

Section V establishes the continuous differentiability of the entropy and defines inverse tem- 
perature as the derivative of the entropy with respect to the energy — in the usual way. No new 
assumptions are needed here. The fact that the entropy of a simple system is determined uniquely
by its adiabats and isotherms is also proved here. 

In Section VI we discuss the vexed question of comparing states of systems that differ in 
constitution or in quantity of matter. How can the entropy of a bottle of water be compared with 
the sum of the entropies of a container of hydrogen and a container of oxygen? To do so requires 
being able to transform one into the other, but this may not be easy to do reversibly. The usual 
theoretical underpinning here is the use of semi-permeable membranes in a 'van't Hoff box' but 
such membranes are usually far from perfect physical objects, if they exist at all. We examine in 
detail just how far one can go in determining the additive constants for the entropies of different 
systems in the real world in which perfect semi-permeable membranes do not exist.

In Section VII we collect all our axioms together and summarize our results briefly. 

II. ADIABATIC ACCESSIBILITY AND CONSTRUCTION OF ENTROPY

Thermodynamics concerns systems, their states and an order relation among these states. The 
order relation is that of adiabatic accessibility, which, physically, is defined by processes whose 
only net effect on the surroundings is exchange of energy with a mechanical source. The glory of 
classical thermodynamics is that there always is an additive function, called entropy, on the state 
space of any system, that exactly describes the order relation in terms of the increase of entropy. 

Additivity is very important physically and is certainly not obvious; it tells us that the entropy 
of a compound system composed of two systems that can interact and exchange energy with each 
other is the sum of the individual entropies. This means that the set of pairs of states accessible from
a given pair of states, which is far larger than the set of pairs individually accessible by
the systems in isolation, is determined by the sum of the individual entropy functions. This
is even more surprising when we consider that the individual entropies each have undetermined
multiplicative constants; there is a way to adjust, or calibrate, the constants in such a way that the
sum gives the correct result for the accessible states, and this can be done once and for all so that
the same calibration works for all possible pairs of systems. Were additivity to fail we would have 
to rewrite the steam tables every time a new steam engine is invented. 

The other important point about entropy, which is often overlooked, is that entropy not only 
increases, but entropy also tells us exactly which processes are adiabatically possible in any given 
system; states of high entropy in a system are always accessible from states of lower entropy. As 
we shall see this is generally true but it could conceivably fail when there are chemical reactions or 
mixing, as discussed in Section VI. 

In this section we begin by defining these basic concepts more precisely, and then we present 
the entropy principle. Next, we introduce certain axioms, A1-A6, relating the concepts. All these 
axioms are completely intuitive. However, one other assumption — which we call the comparison 
hypothesis — is needed for the construction of entropy. It is not at all obvious physically, but it is an 
essential part of conventional thermodynamics. Eventually, in Sections III and IV, this hypothesis 
will be derived from some more detailed physical considerations. For the present, however, this 
hypothesis will be assumed and, using it, the existence of an entropy function will be proved. We
also discuss the extent to which the entropy function is uniquely determined by the order relation; 
the comparison hypothesis plays a key role here. 

The existence of an entropy function is equivalent to axioms A1-A6 in conjunction with CH;
neither more nor less is required. The state space need not have any structure besides the one
implied by the order relation. However, state spaces parametrized by the energy and work coordi- 
nates have an additional, convex structure, which implies concavity of the entropy, provided that 
the formation of convex combinations of states is an adiabatic process. We add this requirement as
axiom A7 to our list of general axioms about the order relation. 

The axioms in this section are so general that they encompass situations where all states 
in a whole neighborhood of a given state are adiabatically accessible from it. Caratheodory's 
principle is the statement that this does not happen for physical thermodynamic systems. In 
contrast, ideal mechanical systems have the property that every state is accessible from every other 
one (by mechanical means alone), and thus the world of mechanical systems will trivially obey the 
entropy principle in the sense that every state has the same entropy. In the last subsection we 
discuss the connection between Caratheodory's principle and the existence of irreversible processes 
starting from a given state. This principle will again be invoked when, in Section III, we derive the
comparison hypothesis for simple thermodynamic systems. 

Temperature will not be used in this section, not even the notion of 'hot' and 'cold'. There 
will be no cycles, Carnot or otherwise. The entropy depends only on, and is defined by, the order
relation. Thus, while the approach given here is not the only path to the second law, it has the 
advantage of a certain simplicity and clarity that at least has pedagogic and conceptual value. We 
ask the reader's patience with our syllogisms, the point being that everything is here clearly spread 
out in full view. There are no hidden assumptions, as often occur in many textbook presentations. 

Finally, we hope that the reader will not be confused by our sometimes lengthy asides about 
the motivation and heuristic meaning of our various definitions and theorems. We also hope these 
remarks will not be construed as part of the structure of the second law. The definitions and 
theorems are self-contained, as we state them, and the remarks that surround them are intended 
only as a helpful guide. 

A. Basic concepts 

1. Systems and their state spaces 

Physically speaking a thermodynamic system consists of certain specified amounts of different 
kinds of matter; it might be divisible into parts that can interact with each other in a specified 
way. A special class of systems called simple systems will be discussed in the next chapter. In any 
case the possible interaction of the system with its surroundings is specified. It is a "black box" 
in the sense that we do not need to know what is in the box, but only its response to exchanging 
energy, volume, etc. with other systems. The states of a system to be considered here are always 
equilibrium states, but the equilibrium may depend upon the existence of internal barriers in the 
system. Intermediate, non-equilibrium states that a system passes through when changing from 
one equilibrium state to another will not be considered. The entropy of a system not in equilibrium 
may, like the temperature of such a system, have a meaning as an approximate and useful concept, 
but this is not our concern in this treatment. 

Our systems can be quite complicated and the outside world can act on them in several ways, 
e.g., by changing the volume and magnetization, or removing barriers. Indeed, we are allowed to 
chop a system into pieces violently and reassemble them in several ways, each time waiting for the 
eventual establishment of equilibrium. 

Our systems must be macroscopic, i.e., not too small. Tiny systems (atoms, molecules, DNA)
exist, to be sure, but we cannot describe their equilibria thermodynamically, i.e., their equilibrium 
states cannot be described in terms of the simple coordinates we use later on. There is a gradual shift 
from tiny systems to macroscopic ones, and the empirical fact is that large enough systems conform 
to the axioms given below. At some stage a system becomes 'macroscopic'; we do not attempt to 
explain this phenomenon or to give an exact rule about which systems are 'macroscopic'. 

On the other hand, systems that are too large are also ruled out because gravitational forces
become important. Two suns cannot unite to form one bigger sun with the same properties (the
way two glasses of water can unite to become one large glass of water). A star with two solar masses
is intrinsically different from a sun of one solar mass. In principle, the two suns could be kept apart 
and regarded as one system, but then this would only be a 'constrained' equilibrium because of the 
gravitational attraction. In other words the conventional notions of 'extensivity' and 'intensivity' 
fail for cosmic bodies. Nevertheless, it is possible to define an entropy for such systems by measuring 
its effect on some standard body. Giles' method is applicable, and our formula (2.20) in Section
II.E (which, in the context of our development, is used only for calibrating the entropies defined
by (2.14) in Section II.D, but which could be taken as an independent definition) would allow it,
too. (The 'nice' systems that do satisfy size-scaling are called 'perfect' by Giles.) The entropy,
so defined, would satisfy additivity but not extensivity, in the 'entropy principle' of Section II.B.
However, to prove this would require a significant enhancement of the basic axioms. In particular,
we would have to take the comparison hypothesis, CH, for all systems as an axiom — as Giles does. 
It is left to the interested reader to carry out such an extension of our scheme. 

A basic operation is composition of two or more systems to form a new system. Physically, 
this simply means putting the individual systems side by side and regarding them as one system. 
We then speak of each system in the union as a subsystem. The subsystems may or may not 
interact for a while, by exchanging heat or volume for instance, but the important point is that 
a state of the total system (when in equilibrium) is described completely by the states of the 
subsystems. 

From the mathematical point of view a system is just a collection of points called a state
space, usually denoted by $\Gamma$. The individual points of a state space are called states and are
denoted here by capital Roman letters, X, Y, Z, etc. From the next section on we shall build up our
collection of states satisfying our axioms from the states of certain special systems, called simple 
systems. (To jump ahead for the moment, these are systems with one or more work coordinates 
but with only one energy coordinate.) In the present section, however, the manner in which states 
are described (i.e., the coordinates one uses, such as energy and volume, etc.) is of no importance.
Not even topological properties are assumed here about our systems, as is often done. In a sense 
it is amazing that much of the second law follows from certain abstract properties of the relation 
among states, independent of physical details (and hence of concepts such as Carnot cycles). In 
approaches like Giles', where it is taken as an axiom that comparable states fall into equivalence 
classes, it is even possible to do without the system concept altogether, or define it simply as an 
equivalence class of states. In our approach, however, one of the main goals is to derive the property 
which Giles takes as an axiom, and systems are basic objects in our axiomatic scheme. 

Mathematically, the composition of two spaces, $\Gamma_1$ and $\Gamma_2$, is simply the Cartesian product of
the state spaces, $\Gamma_1 \times \Gamma_2$. In other words, the states in $\Gamma_1 \times \Gamma_2$ are pairs $(X_1, X_2)$ with
$X_1 \in \Gamma_1$ and $X_2 \in \Gamma_2$. From the physical interpretation of the composition it is clear that the two
spaces $\Gamma_1 \times \Gamma_2$ and $\Gamma_2 \times \Gamma_1$ are to be identified. Likewise, when forming multiple compositions of state
spaces, the order and the grouping of the spaces is immaterial. Thus $(\Gamma_1 \times \Gamma_2) \times \Gamma_3$, $\Gamma_1 \times (\Gamma_2 \times \Gamma_3)$
and $\Gamma_1 \times \Gamma_2 \times \Gamma_3$ are to be identified as far as composition of state spaces is concerned. Strictly
speaking, a symbol like $(X_1, \ldots, X_N)$ with states $X_i$ in state spaces $\Gamma_i$, $i = 1, \ldots, N$, thus stands
for an equivalence class of n-tuples, corresponding to the different groupings and permutations of
the state spaces. Identifications of this type are not uncommon in mathematics (the formation of
direct sums of vector spaces is an example).

A further operation we shall assume is the formation of scaled copies of a given system whose
state space is $\Gamma$. If $t > 0$ is some fixed number (the scaling parameter) the state space $\Gamma^{(t)}$ consists
of points denoted $tX$ with $X \in \Gamma$. On the abstract level $tX$ is merely a symbol, or mnemonic, to
define points in $\Gamma^{(t)}$, but the symbol acquires meaning through the axioms given later in Sect. II.C.
In the physical world, and from Sect. III onward, the state spaces will always be subsets of some
$\mathbb{R}^n$ (parametrized by energy, volume, etc.). In this case $tX$ has the concrete representation as the
product of the real number $t$ and the vector $X \in \mathbb{R}^n$. Thus in this case $\Gamma^{(t)}$ is simply the image of
the set $\Gamma \subset \mathbb{R}^n$ under scaling by the real parameter $t$. Hence, we shall sometimes denote $\Gamma^{(t)}$ by
$t\Gamma$.

Physically, $\Gamma^{(t)}$ is interpreted as the state space of a system that has the same properties as
the system with state space $\Gamma$, except that the amount of each chemical substance in the system
has been scaled by the factor $t$ and the range of extensive variables like energy, volume etc. has
been scaled accordingly. Likewise, $tX$ is obtained from X by scaling energy, volume etc., but also
the matter content of a state X is scaled by the parameter $t$. From this physical interpretation
it is clear that $s(tX) = (st)X$ and $(\Gamma^{(t)})^{(s)} = \Gamma^{(st)}$ and we take these relations also for granted
on the abstract level. The same applies to the identifications $\Gamma^{(1)} = \Gamma$ and $1X = X$, and also
$(\Gamma_1 \times \Gamma_2)^{(t)} = \Gamma_1^{(t)} \times \Gamma_2^{(t)}$ and $t(X, Y) = (tX, tY)$.
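
For state spaces that are subsets of R^n, the scaling and composition rules above can be checked mechanically. The following is a minimal sketch and not part of the original development: the names `scale`, `compose` and `scale_compound` are hypothetical, a state is modelled as a tuple of coordinates such as (U, V), and the identification of reordered or regrouped products is not modelled.

```python
from typing import Tuple

State = Tuple[float, ...]      # point of a simple state space in R^n, e.g. (U, V)
Compound = Tuple[State, ...]   # point of a product space, one entry per subsystem


def scale(t: float, x: State) -> State:
    """Return the scaled copy tX; t must be positive, as in the text."""
    if t <= 0:
        raise ValueError("scaling parameter must be positive")
    return tuple(t * c for c in x)


def compose(*states: State) -> Compound:
    """Form the compound state (X_1, ..., X_N) in the product space."""
    return tuple(states)


def scale_compound(t: float, xs: Compound) -> Compound:
    """Scale every subsystem by the same factor, i.e. t(X, Y) = (tX, tY)."""
    return tuple(scale(t, x) for x in xs)


X: State = (3.0, 2.0)   # illustrative (U, V) values in arbitrary units
Y: State = (1.0, 4.0)

assert scale(4.0, scale(0.5, X)) == scale(2.0, X)   # s(tX) = (st)X
assert scale(1.0, X) == X                            # 1X = X
assert scale_compound(0.5, compose(X, Y)) == compose(scale(0.5, X), scale(0.5, Y))
```

The factors are chosen to be exactly representable in binary floating point, so the equalities hold exactly; for general factors a tolerance-based comparison would be needed.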

The operation of forming compound states is thus an associative and commutative binary 
operation on the set of all states, and the group of positive real numbers acts by the scaling operation 
on this set in a way compatible with the binary operation and the multiplicative structure of the 
real numbers. The same is true for the set of all state spaces. From an algebraic point of view the
simple systems, to be discussed in Section III, are a basis for this algebraic structure. 

While the relation between $\Gamma$ and $\Gamma^{(t)}$ is physically and intuitively fairly obvious, there can
be surprises. Electromagnetic radiation in a cavity ('photon gas'), which is mentioned after (2.6),
is an interesting case; the two state spaces $\Gamma$ and $\Gamma^{(t)}$ and the thermodynamic functions on these
spaces are identical in this case! Moreover, the two spaces are physically indistinguishable. This
will be explained in more detail in Section II.B.

The formation of scaled copies involves a certain physical idealization because it ignores the 
molecular structure of matter. Scaling to arbitrarily small sizes brings quantum effects to the 
fore and macroscopic thermodynamics is no longer applicable. At the other extreme, scaling to 
arbitrarily large sizes brings in unwanted gravitational effects as discussed above. In spite of these 
well known limitations the idealization of continuous scaling is common practice in thermodynamics 
and simplifies things considerably. (In the statistical mechanics literature this goes under the rubric 
of the 'thermodynamic limit'.) It should be noted that scaling is quite compatible with the inclusion 
of 'surface effects' in thermodynamics. This will be discussed in Section III.A.

By composing scaled copies of N systems with state spaces $\Gamma_1, \ldots, \Gamma_N$, one can form, for
$t_1, \ldots, t_N > 0$, their scaled product $\Gamma_1^{(t_1)} \times \cdots \times \Gamma_N^{(t_N)}$ whose points are $(t_1 X_1, t_2 X_2, \ldots, t_N X_N)$.
In the particular case that the $\Gamma_j$'s are identical, i.e., $\Gamma_1 = \Gamma_2 = \cdots = \Gamma$, we shall call any space
of the form $\Gamma^{(t_1)} \times \cdots \times \Gamma^{(t_N)}$ a multiple scaled copy of $\Gamma$. As will be explained later in
connection with Eq. (2.11), it is sometimes convenient in calculations to allow $t = 0$ as scaling
parameter (and even negative values). For the moment let us just note that if $\Gamma^{(0)}$ occurs the
reader is asked to regard it as the empty set or 'nosystem'. In other words, ignore it.

Some examples may help clarify the concepts of systems and state spaces. 

(a) $\Gamma_a$: 1 mole of hydrogen, H2. The state space can be identified with a subset of $\mathbb{R}^2$ with
coordinates U (= energy), V (= volume).

(b) $\Gamma_b$: 1/2 mole of H2. If $\Gamma_a$ and $\Gamma_b$ are regarded as subsets of $\mathbb{R}^2$ then
$\Gamma_b = \Gamma_a^{(1/2)} = \{(\tfrac{1}{2}U, \tfrac{1}{2}V) : (U, V) \in \Gamma_a\}$.

(c) $\Gamma_c$: 1 mole of H2 and 1/2 mole of O2 (unmixed). $\Gamma_c = \Gamma_a \times \Gamma_{(1/2\ \text{mole O2})}$. This is a compound
system.

(d) $\Gamma_d$: 1 mole of H2O.

(e) $\Gamma_e$: 1 mole of H2 + 1/2 mole of O2 (mixed). Note that $\Gamma_e \neq \Gamma_d$ and $\Gamma_e \neq \Gamma_c$. This system shows
the perils inherent in the concept of equilibrium. The system $\Gamma_e$ makes sense as long as one
does not drop in a piece of platinum or walk across the laboratory floor too briskly. Real world
thermodynamics requires that we admit such quasi-equilibrium systems, although perhaps not
quite as dramatic as this one.

(f) $\Gamma_f$: All the equilibrium states of one mole of H2 and half a mole of O2 (plus a tiny bit of
platinum to speed up the reactions) in a container. A typical state will have some fraction of
H2O, some fraction of H2 and some O2. Moreover, these fractions can exist in several phases.

2. The order relation 

The basic ingredient of thermodynamics is the relation

$\prec$

of adiabatic accessibility among states of a system, or even different systems. The statement
$X \prec Y$, when X and Y are points in some (possibly different) state spaces, means that there is an
adiabatic transition, in the sense explained below, that takes the point X into the point Y.

Mathematically, we do not have to ask the meaning of 'adiabatic'. All that matters is that
a list of all possible pairs of states X and Y such that $X \prec Y$ is regarded as given. This list
has to satisfy certain axioms that we prescribe below in subsection C. Among other things it must
be reflexive, i.e., $X \prec X$, and transitive, i.e., $X \prec Y$ and $Y \prec Z$ implies $X \prec Z$. (Technically,
in standard mathematical terminology this is called a preorder relation because we can have both
$X \prec Y$ and $Y \prec X$ without $X = Y$.) Of course, in order to have an interesting thermodynamics
result from our $\prec$ relation it is essential that there are pairs of points X, Y for which $X \prec Y$ is not
true.

Although the physical interpretation of the relation $\prec$ is not needed for the mathematical
development, for applications it is essential to have a clear understanding of its meaning. It is 
difficult to avoid some circularity when defining the concept of adiabatic accessibility. The following 
version (which is in the spirit of Planck's formulation of the second law (Planck, 1926)) appears 
to be sufficiently general and precise and appeals to us. It has the great virtue (as discovered by 
Planck) that it avoids having to distinguish between work and heat — or even having to define the 
concept of heat; heat, in the intuitive sense, can always be generated by rubbing — in accordance 
with Count Rumford's famous discovery while boring cannons! We emphasize, however, that other 
definitions are certainly possible. Our physical definition is the following: 

Adiabatic accessibility: A state Y is adiabatically accessible from a state X, in symbols 
$X \prec Y$, if it is possible to change the state from X to Y by means of an interaction with some
device (which may consist of mechanical and electrical parts as well as auxiliary thermodynamic 
systems) and a weight, in such a way that the device returns to its initial state at the end of the 
process whereas the weight may have changed its position in a gravitational field. 

Let us write

$X \prec\prec Y$ if $X \prec Y$ but $Y \not\prec X$. (2.1)

In the real world Y is adiabatically accessible from X only if $X \prec\prec Y$. When $X \prec Y$ and also
$Y \prec X$ then the state change can only be realized in an idealized sense, for it will take infinitely
long time to achieve it in the manner described. An alternative way is to say that the 'device' that
appears in the definition of accessibility has to return to within '$\varepsilon$' of its original state (whatever
that may mean) and we take the limit $\varepsilon \to 0$. To avoid this kind of discussion we have taken the
definition as given above, but we emphasize that it is certainly possible to redo the whole theory
using only the notion of $\prec\prec$. An emphasis on $\prec\prec$ appears in Lewis and Randall's discussion of the
second law (Lewis and Randall, 1923, page 116).

Remark: It should be noted that the operational definition above is a definition of the concept 
of 'adiabatic accessibility' and not the concept of an 'adiabatic process'. A state change leading 
from X to Y can be achieved in many different ways (usually infinitely many), and not all of them
will be 'adiabatic processes' in the usual terminology. Our concern is not the temporal development
of the state change which, in real processes, always leads out of the space of equilibrium states. Only
the end result for the system and for the rest of the world interests us. However, it is important to
clarify the relation between our definition of adiabatic accessibility and the usual textbook definition
of an adiabatic process. This will be discussed in Section C after Theorem 2.1 and again in Sec.
III; cf. Theorem 3.8. There it will be shown that our definition indeed coincides with the usual
notion based on processes taking place within an 'adiabatic enclosure'. A further point to notice 
is that the word 'adiabatic' is sometimes used to mean "slow" or quasi-static, but nothing of the 
sort is meant here. Indeed, an adiabatic process can be quite violent. The explosion of a bomb in 
a closed container is an adiabatic process. 

Here are some further examples of adiabatic processes: 

1. Expansion or compression of a gas, with or without the help of a weight being raised or lowered. 

2. Rubbing or stirring. 

3. Electrical heating. (Note that the concept of 'heat' is not needed here.) 

4. Natural processes that occur within an isolated compound system after some barriers have 
been removed. This includes mixing and chemical or nuclear processes. 

5. Breaking a system into pieces with a hammer and reassembling (Fig. 1). 

6. Combinations of such changes. 

In the usual parlance, rubbing would be an adiabatic process, but not electrical 'heating', 
because the latter requires the introduction of a pair of wires through the 'adiabatic enclosure'. 
For us, both processes are adiabatic because what is required is that apart from the change of the 
system itself, nothing more than the displacement of a weight occurs. To achieve electrical heating, 
one drills a hole in the container, passes a heater wire through it, connects the wires to a generator 
which, in turn, is connected to a weight. After the heating the generator is removed along with the 
wires, the hole is plugged, and the system is observed to be in a new state. The generator, etc. is 
in its old state and the weight is lower. 

(Insert Figure 1 here) 

We shall use the following terminology concerning any two states X and Y. These states are
said to be comparable (with respect to the relation $\prec$, of course) if either $X \prec Y$ or $Y \prec X$. If
both relations hold we say that X and Y are adiabatically equivalent and write

$X \stackrel{A}{\sim} Y$. (2.2)
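
The derived notions (comparability, strict accessibility as in (2.1), and adiabatic equivalence as in (2.2)) depend only on the list of accessible pairs, so they can be illustrated on a finite toy relation. The sketch below is an illustration under stated assumptions, not a model of any physical system: the four state labels and the relation `PREC` are invented, and the function names are hypothetical.

```python
# Hypothetical toy data: (x, y) in PREC means "y is adiabatically accessible from x".
# A and B are mutually accessible, C is accessible from both but not back,
# and D is related only to itself.
STATES = {"A", "B", "C", "D"}
PREC = {(s, s) for s in STATES} | {("A", "B"), ("B", "A"), ("A", "C"), ("B", "C")}


def prec(x, y):
    """True if y is accessible from x."""
    return (x, y) in PREC


def is_preorder(states, rel):
    """Check reflexivity and transitivity of a finite relation."""
    reflexive = all((s, s) in rel for s in states)
    transitive = all(
        (x, z) in rel for (x, y) in rel for (y2, z) in rel if y == y2
    )
    return reflexive and transitive


def comparable(x, y):
    """Either state is accessible from the other."""
    return prec(x, y) or prec(y, x)


def strictly_precedes(x, y):
    """Accessible one way only, as in (2.1)."""
    return prec(x, y) and not prec(y, x)


def equivalent(x, y):
    """Accessible both ways, as in (2.2)."""
    return prec(x, y) and prec(y, x)


assert is_preorder(STATES, PREC)
assert equivalent("A", "B") and "A" != "B"   # a preorder, not a partial order
assert strictly_precedes("A", "C")
assert not comparable("A", "D")              # comparability can fail in a preorder
```

The last assertion shows that reflexivity and transitivity alone do not force any two states to be comparable; that is an additional property of the relation.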

The comparison hypothesis referred to above is the statement that any two states in the same
state space are comparable. In the examples of systems (a) to (f) above, all satisfy the comparison
hypothesis. Moreover, every point in $\Gamma_c$ is in the relation $\prec$ to many (but not all) points in $\Gamma_d$.
States in different systems may or may not be comparable. An example of non-comparable systems
is one mole of H2 and one mole of O2. Another is one mole of H2 and two moles of H2.

One might think that if the comparison hypothesis, which will be discussed further in Sects.
II.C and II.E, were to fail for some state space then the situation could easily be remedied by
breaking up the state space into smaller pieces inside each of which the hypothesis holds. This,
generally, is false. What is needed to accomplish this is the extra requirement that comparability
is an equivalence relation; this, in turn, amounts to saying that the condition $X \prec Z$ and $Y \prec Z$
implies that X and Y are comparable and, likewise, the condition $Z \prec X$ and $Z \prec Y$ implies that
X and Y are comparable. (This axiom can be found in (Giles, 1964), see axiom 2.1.2, and similar
requirements were made earlier by Landsberg (1956), Falk and Jung (1959) and Buchdahl (1962,
1966).) While these two conditions are logically independent, they can be shown to be equivalent if
the axiom A3 in Section II.C is adopted. In any case, we do not adopt the comparison hypothesis
as an axiom because we find it hard to regard it as a physical necessity. In the same vein, we do 
not assume that comparability is an equivalence relation (which would then lead to the validity of 
the comparison hypothesis for suitably defined subsystems). Our goal is to prove the comparison 
hypothesis starting from axioms that we find more appealing physically. 

B. The entropy principle 

Given the relation $\prec$ for all possible states of all possible systems, we can ask whether this
relation can be encoded in an entropy function according to the following principle, which expresses 
the second law of thermodynamics in a precise and quantitative way: 

Entropy principle: There is a real-valued function on all states of all systems (including 
compound systems), called entropy and denoted by S such that 

a) Monotonicity: When X and Y are comparable states then

$X \prec Y$ if and only if $S(X) \le S(Y)$. (2.3)

(See (2.6) below.)

b) Additivity and extensivity: If X and Y are states of some (possibly different) systems
and if $(X, Y)$ denotes the corresponding state in the composition of the two systems, then the
entropy is additive for these states, i.e.,

$S((X, Y)) = S(X) + S(Y)$. (2.4)

S is also extensive, i.e., for each $t > 0$ and each state X and its scaled copy $tX$,

$S(tX) = tS(X)$. (2.5)

[Note: From now on we shall omit the double parenthesis and write simply $S(X, Y)$ in place of
$S((X, Y))$.]

A logically equivalent formulation of (2.3), that does not use the word 'comparable', is the
following pair of statements:

$X \stackrel{A}{\sim} Y \Longrightarrow S(X) = S(Y)$ and
$X \prec\prec Y \Longrightarrow S(X) < S(Y)$. (2.6)

The last line is especially noteworthy. It says that entropy must increase in an irreversible process.

Our goal is to construct an entropy function that satisfies the criteria (2.3)-(2.5), and to show
that it is essentially unique. We shall proceed in stages, the first being to construct an entropy
function for a single system, $\Gamma$, and its multiple scaled copies (in which comparability is assumed
to hold). Having done this, the problem of relating different systems will then arise, i.e., the 
comparison question for compound systems. In the present Section II (and only in this section) we 
shall simply complete the project by assuming what we need by way of comparability. In Section 
IV, the thermal axioms (the zeroth law of thermodynamics, in particular) will be invoked to verify 
our assumptions about comparability in compound systems. In the remainder of this subsection 
we discuss the significance of conditions (2.3)-(2.5).

The physical content of (2.3) was already commented on; adiabatic processes not only increase 
entropy but an increase of entropy also dictates which adiabatic processes are possible (between 
comparable states, of course). 

The content of additivity, (2.4), is considerably more far reaching than one might think from
the simplicity of the notation, as we mentioned earlier. Consider four states $X, X', Y, Y'$ and
suppose that $X \prec Y$ and $X' \prec Y'$. Then (and this will be one of our axioms) $(X, X') \prec (Y, Y')$,
and (2.4) contains nothing new in this case. On the other hand, the compound system can well
have an adiabatic process in which $(X, X') \prec (Y, Y')$ but $X \not\prec Y$. In this case, (2.4) conveys much
information. Indeed, by monotonicity, there will be many cases of this kind because the inequality
$S(X) + S(X') \le S(Y) + S(Y')$ certainly does not imply that $S(X) \le S(Y)$. The fact that the
inequality $S(X) + S(X') \le S(Y) + S(Y')$ tells us exactly which adiabatic processes are allowed
in the compound system (assuming comparability), independent of any detailed knowledge of the 
manner in which the two systems interact, is astonishing and is at the heart of thermodynamics. 
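
A small numerical illustration may make the point about additivity concrete. It is a sketch under loudly stated assumptions: the entropy S(U) = log U for a single body at fixed volume is a toy form chosen for convenience (all constants dropped), and the energy values are invented. Two identical bodies start with energies 4 and 1 and end with the energy shared equally.

```python
import math


def entropy(u: float) -> float:
    """Toy entropy of one body at fixed volume (assumed form, constants dropped)."""
    return math.log(u)


# Initial states X, X' and final states Y, Y' of two identical bodies,
# each labelled here only by its energy (hypothetical values, arbitrary units).
x, x_prime = 4.0, 1.0
y, y_prime = 2.5, 2.5          # same total energy, shared equally

# Entropy criterion for the compound system: S(X) + S(X') <= S(Y) + S(Y')
compound_allowed = entropy(x) + entropy(x_prime) <= entropy(y) + entropy(y_prime)

# Entropy criterion for the first body taken alone: S(X) <= S(Y)
single_allowed = entropy(x) <= entropy(y)

assert compound_allowed        # log 4 + log 1 < 2 log 2.5
assert not single_allowed      # log 4 > log 2.5
```

In this toy case the sum of entropies increases although the entropy of the first body decreases, so the inequality for the compound system carries information that the single-system inequalities do not.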

Extensivity, (2.5), is almost a consequence of (2.4) alone — but logically it is independent. 
Indeed, (2.4) implies that (2.5) holds for rational numbers t provided one accepts the notion of 
recombination as given in Axiom A5 below, i.e., one can combine two samples of a system in the 
same state into a bigger system in a state with the same intensive properties. (For systems, such 
as cosmic bodies, that do not obey this axiom, extensivity and additivity are truly independent 
concepts.) On the other hand, using the axiom of choice, one may always change a given entropy 
function satisfying (2.3) and (2.4) in such a way that (2.5) is violated for some irrational t, but 
then the function $t \mapsto S(tX)$ would end up being unbounded in every t interval. Such pathological
cases could be excluded by supplementing (2.3) and (2.4) with the requirement that S(tX) should 
locally be a bounded function of t, either from below or above. This requirement, plus (2.4), would 
then imply (2.5). For a discussion related to this point see (Giles, 1964), who effectively considers 
only rational t. See also (Hardy, Littlewood, Polya 1934) for a discussion of the concept of Hamel 
bases which is relevant in this context. 
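
The step from additivity to extensivity for rational t can be written out as a short derivation. This is a sketch only: it assumes the recombination property in the form that n copies of X are adiabatically equivalent to the scaled state nX, which paraphrases the description of Axiom A5 given above rather than quoting its precise statement, and it uses $s(tX) = (st)X$.

```latex
% Assumption (paraphrase of recombination): n copies of X are adiabatically
% equivalent to nX. Equal entropies for equivalent states follow from
% monotonicity (2.3) applied in both directions.
\begin{align*}
(\underbrace{X,\dots,X}_{n}) \stackrel{A}{\sim} nX
  &\;\Longrightarrow\; S(nX) = S(X,\dots,X) = n\,S(X)
  && \text{by (2.3) and (2.4)},\\
X = q\,\bigl(\tfrac{1}{q}X\bigr)
  &\;\Longrightarrow\; S(X) = q\,S\bigl(\tfrac{1}{q}X\bigr)
  && \text{so } S\bigl(\tfrac{1}{q}X\bigr) = \tfrac{1}{q}\,S(X),\\
\tfrac{p}{q}X = p\,\bigl(\tfrac{1}{q}X\bigr)
  &\;\Longrightarrow\; S\bigl(\tfrac{p}{q}X\bigr) = p\,S\bigl(\tfrac{1}{q}X\bigr)
   = \tfrac{p}{q}\,S(X)
  && \text{for positive integers } p, q.
\end{align*}
```

Nothing in this argument reaches irrational t, which is why an extra regularity requirement, such as local boundedness of S(tX) in t, is needed to obtain (2.5) in general.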

The extensivity condition can sometimes have surprising results, as in the case of electromag- 
netic radiation (the 'photon gas'). As is well known (Landau and Lifschitz, 1969, Sect. 60), the 
phase space of such a gas (which we imagine to reside in a box with a piston that can be used to 
change the volume) is the quadrant $\Gamma = \{(U, V) : 0 < U < \infty,\ 0 < V < \infty\}$. Thus,

$\Gamma^{(t)} = \Gamma$

as sets, which is not surprising or even exceptional. What is exceptional is that $S_\Gamma$, which gives
the entropy of the states in $\Gamma$, satisfies

$S_\Gamma(U, V) = (\text{const.})\, V^{1/4} U^{3/4}$.

It is homogeneous of first degree in the coordinates and, therefore, the extensivity law tells us that
the entropy function on the scaled copy $\Gamma^{(t)}$ is

$S_{\Gamma^{(t)}}(U, V) = t\, S_\Gamma(U/t, V/t) = S_\Gamma(U, V)$.

Thus, all the thermodynamic functions on the two state spaces are the same! This unusual situation 
could, in principle, happen for an ordinary material system, but we know of no example besides the 
photon gas. Here, the result can be traced to the fact that particle number is not conserved, as it is 
for material systems, but it does show that one should not jump to conclusions. There is, however, 
a further conceptual point about the photon gas which is physical rather than mathematical. If a
material system had a homogeneous entropy (e.g., $S(U,V) = (\text{const.})\, V^{1/2} U^{1/2}$) we should still be
able to distinguish $\Gamma^{(t)}$ from $\Gamma$, even though the coordinates and entropy were indistinguishable.
This could be done by weighing the two systems and finding out that one weighs t times as much as 
the other. But the photon gas is different: no experiment can tell the two apart. However, weight 
per se plays no role in thermodynamics, so the difference between the material and photon systems 
is not thermodynamically significant. 
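
As a small numerical illustration of the scaling identity for the photon gas, the following sketch checks that $t\,S_\Gamma(U/t, V/t) = S_\Gamma(U,V)$ when $S_\Gamma(U,V) = V^{1/4}U^{3/4}$. The function names, the sample points and the choice of 1 for the multiplicative constant are assumptions made only for this illustration.

```python
def photon_entropy(u: float, v: float, const: float = 1.0) -> float:
    """Homogeneous photon-gas entropy S(U, V) = const * V**(1/4) * U**(3/4)."""
    return const * v ** 0.25 * u ** 0.75


def scaled_copy_entropy(t: float, u: float, v: float) -> float:
    """Entropy on the scaled copy, t * S(U/t, V/t), as prescribed by extensivity."""
    return t * photon_entropy(u / t, v / t)


def max_scaling_deviation(points, scales) -> float:
    """Largest relative deviation between t*S(U/t, V/t) and S(U, V)."""
    worst = 0.0
    for u, v in points:
        s = photon_entropy(u, v)
        for t in scales:
            worst = max(worst, abs(scaled_copy_entropy(t, u, v) - s) / s)
    return worst


# Hypothetical sample states (U, V) and scaling factors t.
POINTS = [(1.0, 1.0), (2.5, 0.3), (10.0, 7.0)]
SCALES = [0.1, 0.5, 2.0, 17.0]
```

Up to floating-point rounding, `max_scaling_deviation(POINTS, SCALES)` should vanish, which is the statement that the scaled copies carry the same entropy function.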

There are two points of view one could take about this anomalous situation. One is to continue
to use the state spaces $\Gamma^{(t)}$, even though they happen to represent identical systems. This is not
really a problem because no one said that $\Gamma^{(t)}$ had to be different from $\Gamma$. The only concern is to
check the axioms, and in this regard there is no problem. We could even allow the additive entropy
constant to depend on $t$, provided it satisfies the extensivity condition (2.5). The second point of
view is to say that there is only one $\Gamma$ and no $\Gamma^{(t)}$'s at all. This would cause us to consider the
photon gas as outside our formalism and to require special handling from time to time. The first
alternative is more attractive to us for obvious reasons. The photon gas will be mentioned again 
in connection with Theorem 2.5. 

C. Assumptions about the order relation 

We now list our assumptions for the order relation $\prec$. As always, $X$, $Y$, etc. will denote states
(that may belong to different systems), and if $X$ is a state in some state space $\Gamma$, then $tX$ with
$t > 0$ is the corresponding state in the scaled state space $\Gamma^{(t)}$.

A1) Reflexivity. $X \stackrel{A}{\sim} X$.

A2) Transitivity. $X \prec Y$ and $Y \prec Z$ implies $X \prec Z$.

A3) Consistency. $X \prec X'$ and $Y \prec Y'$ implies $(X,Y) \prec (X',Y')$.

A4) Scaling invariance. If $X \prec Y$, then $tX \prec tY$ for all $t > 0$.

A5) Splitting and recombination. For $0 < t < 1$,

$$X \stackrel{A}{\sim} (tX, (1-t)X).$$ (2.7)

(If $X \in \Gamma$, then the right side is in the scaled product $\Gamma^{(t)} \times \Gamma^{(1-t)}$, of course.)

A6) Stability. If, for some pair of states, $X$ and $Y$,

$$(X, \varepsilon Z_0) \prec (Y, \varepsilon Z_1)$$

holds for a sequence of $\varepsilon$'s tending to zero and some states $Z_0$, $Z_1$, then

$$X \prec Y.$$

Remark: 'Stability' means simply that one cannot increase the set of accessible states with an
infinitesimal grain of dust.

Besides these axioms the following property of state spaces, the 'comparison hypothesis', plays 
a crucial role in our analysis in this section. It will eventually be established for all state spaces 
after we have introduced some more specific axioms in later sections. 
CH) Definition: We say the comparison hypothesis (CH) holds for a state space if any two
states $X$ and $Y$ in the space are comparable, i.e., $X \prec Y$ or $Y \prec X$.

In the next subsection we shall show that, for every state space, $\Gamma$, assumptions A1-A6, and
CH for all two-fold scaled products, $(1-\lambda)\Gamma \times \lambda\Gamma$, not just $\Gamma$ itself, are in fact equivalent to the
existence of an additive and extensive entropy function that characterizes the order relation on the
states in all scaled products of $\Gamma$. Moreover, for each $\Gamma$, this function is unique, up to an affine
transformation of scale, $S(X) \to aS(X) + B$. Before we proceed to the construction of entropy we
derive a simple property of the order relation from assumptions A1-A6, which is clearly necessary
if the relation is to be characterized by an additive entropy function.

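THEOREM 2.1 (Stability implies cancellation law). Assume properties A1-A6, especially
A6, the stability law. Then the cancellation law holds as follows. If $X$, $Y$ and $Z$ are states
of three (possibly distinct) systems then

$$(X,Z) \prec (Y,Z) \quad \text{implies} \quad X \prec Y \qquad \text{(Cancellation Law)}.$$

Proof: Let $\varepsilon = 1/n$ with $n = 1, 2, 3, \ldots$. Then we have

$$\begin{aligned}
(X, \varepsilon Z) &\stackrel{A}{\sim} ((1-\varepsilon)X, \varepsilon X, \varepsilon Z) && \text{(by A5)} \\
&\prec ((1-\varepsilon)X, \varepsilon Y, \varepsilon Z) && \text{(by A1, A3 and A4)} \\
&\stackrel{A}{\sim} ((1-2\varepsilon)X, \varepsilon X, \varepsilon Y, \varepsilon Z) && \text{(by A5)} \\
&\prec ((1-2\varepsilon)X, 2\varepsilon Y, \varepsilon Z) && \text{(by A1, A3, A4 and A5).}
\end{aligned}$$

By doing this $n = 1/\varepsilon$ times we find that $(X, \varepsilon Z) \prec (Y, \varepsilon Z)$. By the stability axiom A6 we then
have $X \prec Y$. ■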
Remark: Under the additional assumption that $Y$ and $Z$ are comparable states (e.g., if they
are in the same state space for which CH holds), the cancellation law is logically equivalent to the
following statement (using the consistency axiom A3):

If $X \prec\prec Y$ then $(X,Z) \prec\prec (Y,Z)$ for all $Z$.

The cancellation law looks innocent enough, but it is really rather strong. It is a partial
converse of the consistency condition A3 and it says that although the ordering in $\Gamma_1 \times \Gamma_2$ is not
determined simply by the order in $\Gamma_1$ and $\Gamma_2$, there are limits to how much the ordering can vary
beyond the minimal requirements of A3. It should also be noted that the cancellation law is in
accord with our physical interpretation of the order relation in Subsection II.A.2.; a "spectator",
namely $Z$, cannot change the states that are adiabatically accessible from $X$.

Remark about 'Adiabatic Processes': With the aid of the cancellation law we can now discuss
the connection between our notion of adiabatic accessibility and the textbook concept of an
'adiabatic process'. One problem we face is that this latter concept is hard to make precise (this
was our reason for avoiding it in our operational definition) and therefore the discussion must
necessarily be somewhat informal. The general idea of an adiabatic process, however, is that the
system of interest is locked in a thermally isolating enclosure that prevents 'heat' from flowing into
or out of our system. Hence, as far as the system is concerned, all the interaction it has with the
external world during an adiabatic process can be thought of as being accomplished by means of
some mechanical or electrical devices. Our operational definition of the relation $\prec$ appears at first
sight to be based on more general processes, since we allow an auxiliary thermodynamical system
as part of the device. We shall now show that, despite appearances, our definition coincides with
the conventional one.

Let us temporarily denote by $\prec^*$ the relation between states based on adiabatic processes, i.e.,
$X \prec^* Y$ if and only if there is a mechanical/electrical device that starts in a state $M$ and ends up in
a state $M'$ while the system changes from $X$ to $Y$. We now assume that the mechanical/electrical
device can be restored to the initial state $M$ from the final state $M'$ by adding or subtracting
mechanical energy, and this latter process can be reduced to the raising or lowering of a weight in a
gravitational field. (This can be taken as a definition of what we mean by a 'mechanical/electrical
device'. Note that devices with 'dissipation' do not have this property.) Thus, $X \prec^* Y$ means
there is a process in which the mechanical/electrical device starts in some state $M$ and ends up in
the same state, a weight moves from height $h$ to height $h'$, while the state of our system changes
from $X$ to $Y$. In symbols,

$$(X, M, h) \longrightarrow (Y, M, h').$$ (2.8)

In our definition of adiabatic accessibility, on the other hand, we have some arbitrary device,
which interacts with our system and which can generate or remove heat if desired. There is no
thermal enclosure. The important constraint is that the device starts in some state $D$ and ends up
in the same state $D$. As before a weight moves from height $h$ to height $h'$, while our system starts
in state $X$ and ends up in state $Y$. In symbols,

$$(X, D, h) \longrightarrow (Y, D, h').$$ (2.9)

It is clear that (2.8) is a special case of (2.9), so we conclude that $X \prec^* Y$ implies $X \prec Y$. The
device in (2.9) may consist of a thermal part in some state $Z$ and electrical and mechanical parts
in some state $M$. Thus $D = (Z, M)$, and (2.9) clearly implies that $(X,Z) \prec^* (Y,Z)$.

It is natural to assume that $\prec^*$ satisfies axioms A1-A6, just as $\prec$ does. In that case we can
infer the cancellation law for $\prec^*$, i.e., $(X,Z) \prec^* (Y,Z)$ implies $X \prec^* Y$. Hence, $X \prec Y$ (which is
what (2.9) says) implies $X \prec^* Y$. Altogether we have thus shown that $\prec$ and $\prec^*$ are really the same
relation. In words: adiabatic accessibility can always be achieved by an adiabatic process applied
to the system plus a device and, furthermore, the adiabatic process can be simplified (although this
may not be easy to do experimentally) by eliminating all thermodynamic parts of the device, thus
making the process an adiabatic one for the system alone.






D. The construction of entropy for a single system 

Given a state space $\Gamma$ we may, as discussed in Subsection II.A.1., construct its multiple scaled
copies, i.e., states of the form

$$Y = (t_1 Y_1, \ldots, t_N Y_N)$$

with $t_i > 0$, $Y_i \in \Gamma$. It follows from our assumption A5 that if CH (comparison hypothesis) holds
in the state space $\Gamma^{(t_1)} \times \cdots \times \Gamma^{(t_N)}$ with $t_1, \ldots, t_N$ fixed, then any other state of the same form,
$Y' = (t'_1 Y'_1, \ldots, t'_M Y'_M)$ with $Y'_j \in \Gamma$, is comparable to $Y$ provided $\sum_i t_i = \sum_j t'_j$ (but not, in
general, if the sums are not equal). This is proved as follows for $N = M = 2$; the easy extension
to the general case is left to the reader. Since $t_1 + t_2 = t'_1 + t'_2$ we can assume, without loss of
generality, that $t_1 - t'_1 = t'_2 - t_2 > 0$, because the case $t_1 - t'_1 = 0$ is already covered by CH (which was
assumed) for $\Gamma^{(t_1)} \times \Gamma^{(t_2)}$. By the splitting axiom, A5, we have
$(t_1 Y_1, t_2 Y_2) \stackrel{A}{\sim} (t'_1 Y_1, (t_1 - t'_1) Y_1, t_2 Y_2)$
and $(t'_1 Y'_1, t'_2 Y'_2) \stackrel{A}{\sim} (t'_1 Y'_1, (t_1 - t'_1) Y'_2, t_2 Y'_2)$.
The comparability now follows from CH on the space

$$\Gamma^{(t'_1)} \times \Gamma^{(t_1 - t'_1)} \times \Gamma^{(t_2)}.$$

The entropy principle for the states in the multiple scaled copies of a single system will now 
be derived. More precisely, we shall prove the following theorem: 

THEOREM 2.2 (Equivalence of entropy and assumptions A1-A6, CH). Let $\Gamma$ be a
state space and let $\prec$ be a relation on the multiple scaled copies of $\Gamma$. The following statements are
equivalent.

(1) The relation $\prec$ satisfies axioms A1-A6, and CH holds for all multiple scaled copies of $\Gamma$.

(2) There is a function, $S_\Gamma$ on $\Gamma$ that characterizes the relation in the sense that if
$t_1 + \cdots + t_N = t'_1 + \cdots + t'_M$ (for all $N \geq 1$ and $M \geq 1$) then

$$(t_1 Y_1, \ldots, t_N Y_N) \prec (t'_1 Y'_1, \ldots, t'_M Y'_M)$$

holds if and only if

$$\sum_{i=1}^{N} t_i S_\Gamma(Y_i) \leq \sum_{j=1}^{M} t'_j S_\Gamma(Y'_j).$$ (2.10)

The function $S_\Gamma$ is uniquely determined on $\Gamma$, up to an affine transformation, i.e., any other
function $S^*_\Gamma$ on $\Gamma$ satisfying (2.10) is of the form $S^*_\Gamma(X) = a S_\Gamma(X) + B$ with constants $a > 0$ and
$B$.

Definition. A function $S_\Gamma$ on $\Gamma$ that characterizes the relation $\prec$ on the multiple scaled copies
of $\Gamma$ in the sense stated in the theorem is called an entropy function on $\Gamma$.

We shall split the proof of Theorem 2.2 into Lemmas 2.1, 2.2, 2.3 and Theorem 2.3 below. 

At this point it is convenient to introduce the following notion of generalized ordering.
While $(a_1 X_1, a_2 X_2, \ldots, a_N X_N)$ has so far only been defined when all $a_i > 0$, we can define the
meaning of the relation

$$(a_1 X_1, \ldots, a_N X_N) \prec (a'_1 X'_1, \ldots, a'_M X'_M)$$ (2.11)

for arbitrary $a_i \in \mathbb{R}$, $a'_i \in \mathbb{R}$, $N$ and $M$ positive integers and $X_i \in \Gamma_i$, $X'_i \in \Gamma'_i$ as follows. If any
$a_i$ (or $a'_i$) is zero we just ignore the corresponding term. Example: $(0X_1, X_2) \prec (2X_3, 0X_4)$ means
the same thing as $X_2 \prec 2X_3$. If any $a_i$ (or $a'_i$) is negative, just move $a_i X_i$ (or $a'_i X'_i$) to the other
side and change the sign of $a_i$ (or $a'_i$). Example:

$$(2X_1, X_2) \prec (X_3, -5X_4, 2X_5, X_6)$$

means that

$$(2X_1, 5X_4, X_2) \prec (X_3, 2X_5, X_6)$$

in $\Gamma_1^{(2)} \times \Gamma_4^{(5)} \times \Gamma_2$ and $\Gamma_3 \times \Gamma_5^{(2)} \times \Gamma_6$. (Recall that $\Gamma_a \times \Gamma_b = \Gamma_b \times \Gamma_a$.) It is easy to check, using
the cancellation law, that the splitting and recombination axiom A5 extends to nonpositive scaling
parameters, i.e., axioms A1-A6 imply that $X \stackrel{A}{\sim} (aX, bX)$ for all $a, b \in \mathbb{R}$ with $a + b = 1$, if the
relation $\prec$ for nonpositive $a$ and $b$ is understood in the sense just described.
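
The bookkeeping rule for generalized ordering can be sketched in a few lines. The representation of a compound state as a list of (coefficient, label) pairs and the function name `normalize_relation` are assumptions introduced only for this illustration; the sketch performs the rewriting of signs and does not decide whether the relation holds.

```python
from typing import List, Tuple

Term = Tuple[float, str]  # (scaling coefficient, state label); labels are hypothetical


def normalize_relation(left: List[Term], right: List[Term]) -> Tuple[List[Term], List[Term]]:
    """Rewrite 'left precedes right' so that every coefficient is strictly positive.

    Terms with coefficient zero are dropped; terms with a negative coefficient
    are moved to the other side with the sign of the coefficient changed.
    """
    new_left: List[Term] = []
    new_right: List[Term] = []
    for coeff, state in left:
        if coeff > 0:
            new_left.append((coeff, state))
        elif coeff < 0:
            new_right.append((-coeff, state))
    for coeff, state in right:
        if coeff > 0:
            new_right.append((coeff, state))
        elif coeff < 0:
            new_left.append((-coeff, state))
    return new_left, new_right
```

Applied to the second example in the text, `normalize_relation([(2, "X1"), (1, "X2")], [(1, "X3"), (-5, "X4"), (2, "X5"), (1, "X6")])` moves the term with coefficient -5 to the left side with coefficient 5. The order of the terms on each side is immaterial, since the product of state spaces is commutative.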

For the definition of the entropy function we need the following lemma, which depends crucially
on the stability assumption A6 and on the comparison hypothesis CH for the state spaces
$\Gamma^{(1-\lambda)} \times \Gamma^{(\lambda)}$.

LEMMA 2.1 Suppose $X_0$ and $X_1$ are two points in $\Gamma$ with $X_0 \prec\prec X_1$. For $\lambda \in \mathbb{R}$ define

$$S_\lambda = \{X \in \Gamma : ((1-\lambda)X_0, \lambda X_1) \prec X\}.$$ (2.12)

Then

(i) For every $X \in \Gamma$ there is a $\lambda \in \mathbb{R}$ such that $X \in S_\lambda$.
(ii) For every $X \in \Gamma$, $\sup\{\lambda : X \in S_\lambda\} < \infty$.

Remark. Since $X \stackrel{A}{\sim} ((1-\lambda)X, \lambda X)$ by assumption A5, the definition of $S_\lambda$ really involves the
order relation on double scaled copies of $\Gamma$ (or on $\Gamma$ itself, if $\lambda = 0$ or $1$).

Proof of Lemma 2.1. (i) If $X_0 \prec X$ then obviously $X \in S_0$ by axiom A2. For general $X$ we
claim that

$$(1+\alpha)X_0 \prec (\alpha X_1, X)$$ (2.13)

for some $\alpha \geq 0$ and hence $((1-\lambda)X_0, \lambda X_1) \prec X$ with $\lambda = -\alpha$. The proof relies on stability, A6,
and the comparison hypothesis CH (which comes into play for the first time): If (2.13) were not
true, then by CH we would have

$$(\alpha X_1, X) \prec (1+\alpha)X_0$$

for all $\alpha > 0$ and so, by scaling, A4, and A5,

$$\left(X_1, \tfrac{1}{\alpha}X\right) \prec \left(X_0, \tfrac{1}{\alpha}X_0\right).$$

By the stability axiom A6 this would imply $X_1 \prec X_0$ in contradiction to $X_0 \prec\prec X_1$.

(ii) If $\sup\{\lambda : X \in S_\lambda\} = \infty$, then for some sequence of $\lambda$'s tending to infinity we would
have $((1-\lambda)X_0, \lambda X_1) \prec X$ and hence $(X_0, \lambda X_1) \prec (X, \lambda X_0)$ by A3 and A5. By A4 this implies
$(\tfrac{1}{\lambda}X_0, X_1) \prec (\tfrac{1}{\lambda}X, X_0)$ and hence $X_1 \prec X_0$ by stability, A6. ■

We can now state our formula for the entropy function. If all points in $\Gamma$ are adiabatically
equivalent there is nothing to prove (the entropy is constant), so we may assume that there are
points $X_0, X_1 \in \Gamma$ with $X_0 \prec\prec X_1$. We then define for $X \in \Gamma$

$$S_\Gamma(X) := \sup\{\lambda : ((1-\lambda)X_0, \lambda X_1) \prec X\}.$$ (2.14)

(The symbol $a := b$ means that $a$ is defined by $b$.) This $S_\Gamma$ will be referred to as the canonical
entropy on $\Gamma$ with reference points $X_0$ and $X_1$. This definition is illustrated in Figure 2.

Insert Figure 2 here

By Lemma 2.1 $S_\Gamma(X)$ is well defined and $S_\Gamma(X) < \infty$ for all $X$. (Note that by stability we
could replace $\prec$ by $\prec\prec$ in (2.14).) We shall now show that this $S_\Gamma$ has all the right properties. The
first step is the following simple lemma, which does not depend on the comparison hypothesis.

LEMMA 2.2 ($\prec$ is equivalent to $\leq$). Suppose $X_0 \prec\prec X_1$ are states and $a_0, a_1, a'_0, a'_1$ are
real numbers with $a_0 + a_1 = a'_0 + a'_1$. Then the following are equivalent.
(i) $(a_0 X_0, a_1 X_1) \prec (a'_0 X_0, a'_1 X_1)$
(ii) $a_1 \leq a'_1$ (and hence $a_0 \geq a'_0$).

In particular, $\stackrel{A}{\sim}$ holds in (i) if and only if $a_1 = a'_1$ and $a_0 = a'_0$.

Proof: We give the proof assuming that the numbers $a_0, a_1, a'_0, a'_1$ are all positive and
$a_0 + a_1 = a'_0 + a'_1 = 1$. The other cases are similar. We write $a_1 = \lambda$ and $a'_1 = \lambda'$.

(i) $\Rightarrow$ (ii). If $\lambda > \lambda'$ then, by A5 and A3,
$((1-\lambda)X_0, \lambda' X_1, (\lambda-\lambda')X_1) \prec ((1-\lambda)X_0, (\lambda-\lambda')X_0, \lambda' X_1)$.
By the cancellation law, Theorem 2.1, $(\lambda-\lambda')X_1 \prec (\lambda-\lambda')X_0$. By scaling
invariance, A4, $X_1 \prec X_0$, which contradicts $X_0 \prec\prec X_1$.
(ii) $\Rightarrow$ (i). This follows from the following computation.

$$\begin{aligned}
((1-\lambda)X_0, \lambda X_1) &\stackrel{A}{\sim} ((1-\lambda')X_0, (\lambda'-\lambda)X_0, \lambda X_1) && \text{(by axioms A3 and A5)} \\
&\prec ((1-\lambda')X_0, (\lambda'-\lambda)X_1, \lambda X_1) && \text{(by axioms A3 and A4)} \\
&\stackrel{A}{\sim} ((1-\lambda')X_0, \lambda' X_1) && \text{(by axioms A3 and A5).}
\end{aligned}$$

■

The next lemma will imply, among other things, that entropy is unique, up to an affine 
transformation. 

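LEMMA 2.3 (Characterization of entropy). Let $S_\Gamma$ denote the canonical entropy (2.14)
on $\Gamma$ with respect to the reference points $X_0 \prec\prec X_1$. If $X \in \Gamma$ then the equality

$$\lambda = S_\Gamma(X)$$

is equivalent to

$$X \stackrel{A}{\sim} ((1-\lambda)X_0, \lambda X_1).$$

Proof: First, if $\lambda = S_\Gamma(X)$ then, by the definition of supremum, there is a sequence
$\varepsilon_1 \geq \varepsilon_2 \geq \ldots \geq 0$ converging to zero, such that

$$((1-(\lambda-\varepsilon_n))X_0, (\lambda-\varepsilon_n)X_1) \prec X$$

for each $n$. Hence, by A5,

$$((1-\lambda)X_0, \lambda X_1, \varepsilon_n X_0) \stackrel{A}{\sim} ((1-\lambda+\varepsilon_n)X_0, (\lambda-\varepsilon_n)X_1, \varepsilon_n X_1) \prec (X, \varepsilon_n X_1),$$

and thus $((1-\lambda)X_0, \lambda X_1) \prec X$ by the stability property A6. On the other hand, since $\lambda$ is the
supremum we have

$$X \prec ((1-(\lambda+\varepsilon))X_0, (\lambda+\varepsilon)X_1)$$

for all $\varepsilon > 0$ by the comparison hypothesis CH. Thus,

$$(X, \varepsilon X_0) \prec ((1-\lambda)X_0, \lambda X_1, \varepsilon X_1),$$

so, by A6, $X \prec ((1-\lambda)X_0, \lambda X_1)$. This shows that $X \stackrel{A}{\sim} ((1-\lambda)X_0, \lambda X_1)$ when $\lambda = S_\Gamma(X)$.

Conversely, if $\lambda' \in [0,1]$ is such that $X \stackrel{A}{\sim} ((1-\lambda')X_0, \lambda' X_1)$, then
$((1-\lambda')X_0, \lambda' X_1) \stackrel{A}{\sim} ((1-\lambda)X_0, \lambda X_1)$ by transitivity. Thus, $\lambda = \lambda'$ by Lemma 2.2. ■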

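Remark 1: Without the comparison hypothesis we could find that $S_\Gamma(X_0) = 0$ and $S_\Gamma(X) = 1$
for all $X$ such that $X_0 \prec X$.

Remark 2: From Lemma 2.3 and the cancellation law it follows that the canonical entropy
with reference points $X_0 \prec\prec X_1$ satisfies $0 \leq S_\Gamma(X) \leq 1$ if and only if $X$ belongs to the strip
$\Sigma(X_0, X_1)$ defined by

$$\Sigma(X_0, X_1) := \{X \in \Gamma : X_0 \prec X \prec X_1\} \subset \Gamma.$$

Let us make the dependence of the canonical entropy on $X_0$ and $X_1$ explicit by writing

$$S_\Gamma(X) = S_\Gamma(X|X_0, X_1).$$ (2.15)

For $X$ outside the strip we can then write

$$S_\Gamma(X|X_0, X_1) = S_\Gamma(X_1|X_0, X)^{-1} \quad \text{if } X_1 \prec X$$

and

$$S_\Gamma(X|X_0, X_1) = -\frac{S_\Gamma(X_0|X, X_1)}{1 - S_\Gamma(X_0|X, X_1)} \quad \text{if } X \prec X_0.$$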
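
To make definition (2.14) concrete, here is a minimal sketch that evaluates the supremum by bisection in a toy model. The model is an assumption made only for illustration: states are pairs (U, V), and the order relation on scaled products is decided by an additive reference function `s0`, taken here to be an ideal-gas-like expression. The names `s0`, `mixture_precedes` and `canonical_entropy` are hypothetical.

```python
import math


def s0(state):
    """Assumed model function that decides the order relation (not part of the theory)."""
    u, v = state
    return 1.5 * math.log(u) + math.log(v)


def mixture_precedes(lam, x0, x1, x):
    """Model test of ((1 - lam) X0, lam X1) preceding X, using additivity of s0.

    For lam outside [0, 1] this is the generalized ordering: moving a term to
    the other side does not change the linear inequality.
    """
    return (1.0 - lam) * s0(x0) + lam * s0(x1) <= s0(x)


def canonical_entropy(x, x0, x1, tol=1e-12):
    """Approximate sup{lam : ((1 - lam) X0, lam X1) precedes X} by bisection."""
    if not s0(x0) < s0(x1):
        raise ValueError("reference points must satisfy X0 strictly preceding X1")
    lo, hi = -1.0, 1.0
    while not mixture_precedes(lo, x0, x1, x):  # move lo down until it lies in the set
        lo *= 2.0
    while mixture_precedes(hi, x0, x1, x):      # move hi up until it lies outside the set
        hi *= 2.0
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if mixture_precedes(mid, x0, x1, x):
            lo = mid
        else:
            hi = mid
    return lo
```

In this model the result agrees with the closed form (s0(X) - s0(X0)) / (s0(X1) - s0(X0)), the reference points receive the values 0 and 1, and for a state X with X1 preceding X the value is the reciprocal of the canonical entropy of X1 taken with reference points X0 and X.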

Proof of Theorem 2.2:

(1) $\Rightarrow$ (2): Put $\lambda_i = S_\Gamma(Y_i)$, $\lambda'_i = S_\Gamma(Y'_i)$. By Lemma 2.3 we know that
$Y_i \stackrel{A}{\sim} ((1-\lambda_i)X_0, \lambda_i X_1)$ and $Y'_i \stackrel{A}{\sim} ((1-\lambda'_i)X_0, \lambda'_i X_1)$. By the consistency axiom A3 and the recombination
axiom A5 it follows that

$$(t_1 Y_1, \ldots, t_N Y_N) \stackrel{A}{\sim} \Big(\sum_i t_i(1-\lambda_i) X_0, \ \sum_i t_i \lambda_i X_1\Big)$$

and

$$(t'_1 Y'_1, \ldots, t'_M Y'_M) \stackrel{A}{\sim} \Big(\sum_i t'_i(1-\lambda'_i) X_0, \ \sum_i t'_i \lambda'_i X_1\Big).$$

Statement (2) now follows from Lemma 2.2. The implication (2) $\Rightarrow$ (1) is obvious.

The proof of Theorem 2.2 is now complete except for the uniqueness part. We formulate this
part separately in Theorem 2.3 below, which is slightly stronger than the last assertion in Theorem
2.2. It implies that an entropy function for the multiple scaled copies of $\Gamma$ is already uniquely
determined, up to an affine transformation, by the relation on states of the form $((1-\lambda)X, \lambda Y)$,
i.e., it requires only the case $N = M = 2$, in the notation of Theorem 2.2.

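THEOREM 2.3 (Uniqueness of entropy). If $S^*_\Gamma$ is a function on $\Gamma$ that satisfies

$$((1-\lambda)X, \lambda Y) \prec ((1-\lambda)X', \lambda Y')$$

if and only if

$$(1-\lambda)S^*_\Gamma(X) + \lambda S^*_\Gamma(Y) \leq (1-\lambda)S^*_\Gamma(X') + \lambda S^*_\Gamma(Y'),$$

for all $\lambda \in \mathbb{R}$ and $X, Y, X', Y' \in \Gamma$, then

$$S^*_\Gamma(X) = a S_\Gamma(X) + B$$

with

$$a = S^*_\Gamma(X_1) - S^*_\Gamma(X_0) > 0, \qquad B = S^*_\Gamma(X_0).$$

Here $S_\Gamma$ is the canonical entropy on $\Gamma$ with reference points $X_0 \prec\prec X_1$.

Proof: This follows immediately from Lemma 2.3, which says that for every $X$ there is a unique
$\lambda$, namely $\lambda = S_\Gamma(X)$, such that

$$X \stackrel{A}{\sim} ((1-\lambda)X, \lambda X) \stackrel{A}{\sim} ((1-\lambda)X_0, \lambda X_1).$$

Hence, by the hypothesis on $S^*_\Gamma$, and $\lambda = S_\Gamma(X)$, we have

$$S^*_\Gamma(X) = (1-\lambda)S^*_\Gamma(X_0) + \lambda S^*_\Gamma(X_1) = [S^*_\Gamma(X_1) - S^*_\Gamma(X_0)]\,S_\Gamma(X) + S^*_\Gamma(X_0).$$

The hypothesis on $S^*_\Gamma$ also implies that $a := S^*_\Gamma(X_1) - S^*_\Gamma(X_0) > 0$, because $X_0 \prec\prec X_1$. ■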
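
The affine relation in Theorem 2.3 can be checked numerically in a toy model. The sketch below assumes, purely for illustration, an entropy function `s_star` on states (U, V) that characterizes the order relation. Under that assumption the canonical entropy with reference points X0, X1 is the solution lam of (1 - lam) s(X0) + lam s(X1) = s(X), which is what the hypothetical helper `canonical_from` computes.

```python
import math


def s_star(state):
    """An assumed entropy function on the model state space of pairs (U, V)."""
    u, v = state
    return 3.7 * (1.5 * math.log(u) + math.log(v)) + 2.0


def canonical_from(s, x, x0, x1):
    """Canonical entropy of x with reference points x0, x1 when s characterizes the order.

    Solves (1 - lam) * s(x0) + lam * s(x1) = s(x) for lam.
    """
    return (s(x) - s(x0)) / (s(x1) - s(x0))


def affine_constants(s, x0, x1):
    """Constants of Theorem 2.3: a = s(x1) - s(x0) and B = s(x0)."""
    return s(x1) - s(x0), s(x0)
```

With reference points such as X0 = (1, 1) and X1 = (2, 3), the constants returned by `affine_constants` reproduce `s_star(x)` as `a * canonical_from(s_star, x, X0, X1) + B` for any state x, up to rounding.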

Remark: Note that $S^*_\Gamma$ is defined on $\Gamma$ and satisfies $S^*_\Gamma(X) = a S_\Gamma(X) + B$ there. On the space
$\Gamma^{(t)}$ a corresponding entropy is, by definition, given by
$S^*_{\Gamma^{(t)}}(tX) = t S^*_\Gamma(X) = a t S_\Gamma(X) + tB = a S^{(t)}_\Gamma(tX) + tB$,
where $S^{(t)}_\Gamma(tX)$ is the canonical entropy on $\Gamma^{(t)}$ with reference points $tX_0$, $tX_1$.
Thus, $S^*_{\Gamma^{(t)}}(tX) \neq a S^{(t)}_\Gamma(tX) + B$ (unless $B = 0$, of course).

It is apparent from formula (2.14) that the definition of the canonical entropy function on $\Gamma$
involves only the relation $\prec$ on the double scaled products $\Gamma^{(1-\lambda)} \times \Gamma^{(\lambda)}$ besides the reference points
$X_0$ and $X_1$. Moreover, the canonical entropy uniquely characterizes the relation on all multiple
scaled copies of $\Gamma$, which implies in particular that CH holds for all multiple scaled copies. Theorem
2.3 may therefore be rephrased as follows:

THEOREM 2.4 (The relation on double scaled copies determines the relation
everywhere). Let $\prec$ and $\prec^*$ be two relations on the multiple scaled copies of $\Gamma$ satisfying axioms
A1-A6, and also CH for $\Gamma^{(1-\lambda)} \times \Gamma^{(\lambda)}$ for each fixed $\lambda \in [0,1]$. If $\prec$ and $\prec^*$ coincide on $\Gamma^{(1-\lambda)} \times \Gamma^{(\lambda)}$
for each $\lambda \in [0,1]$, then $\prec$ and $\prec^*$ coincide on all multiple scaled copies of $\Gamma$, and CH holds on all
the multiple scaled copies.

The proof of Theorem 2.2 is now complete.

E. Construction of a universal entropy in the absence of mixing

In the previous subsection we showed how to construct an entropy for a single system, $\Gamma$, that
exactly describes the relation $\prec$ within the states obtained by forming multiple scaled copies of $\Gamma$.
It is unique up to a multiplicative constant $a > 0$ and an additive constant $B$, i.e., to within an
affine transformation. We remind the reader that this entropy was constructed by considering just
the product of two scaled copies of $\Gamma$, but our axioms implied that it automatically worked for all
multiple scaled copies of $\Gamma$. We shall refer to $a$ and $B$ as entropy constants for the system $\Gamma$.

Our goal is to put these entropies together and show that they behave in the right way on
products of arbitrarily many copies of different systems. Moreover, this 'universal' entropy will be
unique up to one multiplicative constant, but still many additive constants. The central question
here is one of 'calibration', which is to say that the multiplicative constant in front of each elementary
entropy has to be chosen in such a way that the additivity rule (2.4) holds. It is not even
obvious yet that the additivity can be made to hold at all, whatever the choice of constants.

Let us note that the number of additive constants depends heavily on the kinds of adiabatic
processes available. The system consisting of one mole of hydrogen mixed with one mole of helium
and the system consisting of one mole of hydrogen mixed with two moles of helium are different. The
additive constants are independent unless a process exists in which both systems can be unmixed,
thereby making the constants comparable. In nature we expect only 92 constants, one for each
element of the periodic table, unless we allow nuclear processes as well, in which case there are
only two constants (for neutrons and for hydrogen). On the other hand, if un-mixing is not allowed
uncountably many constants are undetermined. In Section VI we address the question of adiabatic
processes that unmix mixtures and reverse chemical reactions. That such processes exist is not so
obvious.

To be precise, the principal goal of this subsection is the proof of the following Theorem 2.5, 
which is a case of the entropy principle that is special in that it is restricted to processes that do 
not involve mixing or chemical reactions. It is a generalization of Theorem 2.2. 

THEOREM 2.5 (Consistent entropy scales). Consider a family of systems fulfilling the
following requirements:

(i) The state spaces of any two systems in the family are disjoint sets, i.e., every state of a system
in the family belongs to exactly one state space.
(ii) All multiple scaled products of systems in the family belong also to the family.
(iii) Every system in the family satisfies the comparison hypothesis.

For each state space $\Gamma$ of a system in the family let $S_\Gamma$ be some definite entropy function on
$\Gamma$. Then there are constants $a_\Gamma$ and $B_\Gamma$ such that the function $S$, defined for all states in all $\Gamma$'s by

$$S(X) = a_\Gamma S_\Gamma(X) + B_\Gamma$$

for $X \in \Gamma$, has the following properties:

a) If $X$ and $Y$ are in the same state space then

$$X \prec Y \quad \text{if and only if} \quad S(X) \leq S(Y).$$

b) $S$ is additive and extensive, i.e.,

$$S(X,Y) = S(X) + S(Y)$$ (2.4)

and, for $t > 0$,

$$S(tX) = tS(X).$$ (2.5)

Remark. Note that $\Gamma_1$ and $\Gamma_1 \times \Gamma_2$ are disjoint as sets for any (nonempty) state spaces $\Gamma_1$ and
$\Gamma_2$.

Proof: Fix some system $\Gamma_0$ and two points $Z_0 \prec\prec Z_1$ in $\Gamma_0$. In each state space $\Gamma$ choose some
fixed point $X_\Gamma \in \Gamma$ in such a way that the identities

$$X_{\Gamma_1 \times \Gamma_2} = (X_{\Gamma_1}, X_{\Gamma_2})$$ (2.16)
$$X_{t\Gamma} = t X_\Gamma$$ (2.17)

hold. With the aid of the axiom of choice this can be achieved by considering the formal vector
space spanned by all systems and choosing a Hamel basis of systems $\{\Gamma_\alpha\}$ in this space such that
every system can be written uniquely as a scaled product of a finite number of the $\Gamma_\alpha$'s. (See Hardy,
Littlewood and Polya, 1934.) The choice of an arbitrary state $X_{\Gamma_\alpha}$ in each of these 'elementary'
systems $\Gamma_\alpha$ then defines for each $\Gamma$ a unique $X_\Gamma$ such that (2.17) holds. (If the reader does not
wish to invoke the axiom of choice then an alternative is to hypothesize that every system has a
unique decomposition into elementary systems; the simple systems considered in the next section
obviously qualify as the elementary systems.)

For $X \in \Gamma$ we consider the space $\Gamma \times \Gamma_0$ with its canonical entropy as defined in (2.14), (2.15)
relative to the points $(X_\Gamma, Z_0)$ and $(X_\Gamma, Z_1)$. Using this function we define

$$S(X) = S_{\Gamma \times \Gamma_0}\big((X, Z_0) \,\big|\, (X_\Gamma, Z_0), (X_\Gamma, Z_1)\big).$$ (2.18)

Note: Equation (2.18) fixes the entropy of $X_\Gamma$ to be zero.

Let us denote $S(X)$ by $\lambda$ which, by Lemma 2.3, is characterized by

$$(X, Z_0) \stackrel{A}{\sim} ((1-\lambda)(X_\Gamma, Z_0), \lambda(X_\Gamma, Z_1)).$$

By the cancellation law this is equivalent to

$$(X, \lambda Z_0) \stackrel{A}{\sim} (X_\Gamma, \lambda Z_1).$$ (2.19)

By (2.16) and (2.17) this immediately implies the additivity and extensivity of $S$. Moreover,
since $X \prec Y$ holds if and only if $(X, Z_0) \prec (Y, Z_0)$ it is also clear that $S$ is an entropy function on
any $\Gamma$. Hence $S$ and $S_\Gamma$ are related by an affine transformation, according to Theorem 2.3. ■

Definition (Consistent entropies). A collection of entropy functions $S_\Gamma$ on state spaces
$\Gamma$ is called consistent if the appropriate linear combination of the functions is an entropy function
on all multiple scaled products of these state spaces. In other words, the set is consistent if the
multiplicative constants $a_\Gamma$, referred to in Theorem 2.5, can all be chosen equal to 1.

Important Remark: From the definition, (2.14), of the canonical entropy and (2.19) it follows
that the entropy (2.18) is given by the formula

$$S(X) = \sup\{\lambda : (X_\Gamma, \lambda Z_1) \prec (X, \lambda Z_0)\}$$ (2.20)

for $X \in \Gamma$. The auxiliary system $\Gamma_0$ can thus be regarded as an 'entropy meter' in the spirit of
(Lewis and Randall, 1923) and (Giles, 1964). Since we have chosen to define the entropy for each
system independently, by equation (2.14), the role of $\Gamma_0$ in our approach is solely to calibrate the
entropy of different systems in order to make them consistent.

Remark about the photon gas: As we discussed in Section II.B the photon gas is special and
there are two ways to view it. One way is to regard the scaled copies $\Gamma^{(t)}$ as distinct systems and
the other is to say that there is only one $\Gamma$ and the scaled copies are identical to it and, in particular,
must have exactly the same entropy function. We shall now see how the first point of view can be
reconciled with the latter requirement. Note, first, that in our construction above we cannot take
the point $(U, V) = (0, 0)$ to be the fiducial point $X_\Gamma$ because $(0, 0)$ is not in our state space which,
according to the discussion in Section III below, has to be an open set and hence cannot contain
any of its boundary points such as $(0,0)$. Therefore, we have to make another choice, so let us
take $X_\Gamma = (1, 1)$. But the construction in the proof above sets $S_\Gamma(1,1) = 0$ and therefore $S_\Gamma(U,V)$
will not have the homogeneous form $S^{\rm hom}(U,V) = V^{1/4}U^{3/4}$. Nevertheless, the entropies of the
scaled copies will be extensive, as required by the theorem. If one feels that all scaled copies should
have the same entropy (because they represent the same physical system) then the situation can be
remedied in the following way: With $S_\Gamma(U,V)$ being the entropy constructed as in the proof using
$(1,1)$, we note that $S_\Gamma(U,V) = S^{\rm hom}(U,V) + B_\Gamma$ with the constant $B_\Gamma$ given by $B_\Gamma = -S_\Gamma(2,2)$.
This follows from simple algebra and the fact that we know that the entropy of the photon gas
constructed in our proof must equal $S^{\rm hom}$ to within an additive constant. (The reader might ask
how we know this and the answer is that the entropy of the 'gas' is unique up to additive and
multiplicative constants, the latter being determined by the system of units employed. Thus, the
entropy determined by our construction must be the 'correct entropy', up to an additive constant,
and this 'correct entropy' is what it is, as determined by physical measurement. Hopefully it agrees
with the function deduced in (Landau and Lifschitz, 1969).) Let us use our freedom to alter the
additive constants as we please, provided we maintain the extensivity condition (2.5). It will not be
until Section VI that we have to worry about the additive constants per se because it is only there
that mixing and chemical reactions are treated. Therefore, we redefine the entropy of the state space
$\Gamma$ of the photon gas to be $S^*(U,V) := S_\Gamma(U,V) + S_\Gamma(2,2)$, which is the same as $S^{\rm hom}(U,V)$. We
also have to alter the entropy of the scaled copies according to the rule that preserves extensivity,
namely $S_{\Gamma^{(t)}}(U,V) \to S_{\Gamma^{(t)}}(U,V) + t S_\Gamma(2,2) = S_{\Gamma^{(t)}}(U,V) + S_{\Gamma^{(t)}}(2t,2t) = S^{\rm hom}(U,V)$. In this
way, all the scaled copies now have the same (homogeneous) entropy, but we remind the reader
that the same construction could be carried out for any material system with a homogeneous (or,
more exactly an affine) entropy function, if one existed. From the thermodynamic viewpoint, the
photon gas is unusual but not special.
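
The shift of additive constants described in this remark can be followed numerically. The sketch below assumes the multiplicative constant is 1 and takes the fiducial point to be (1, 1), as in the text; the function names are hypothetical and introduced only for illustration.

```python
def s_hom(u: float, v: float) -> float:
    """Homogeneous photon-gas entropy V**(1/4) * U**(3/4), multiplicative constant set to 1."""
    return v ** 0.25 * u ** 0.75


def s_constructed(u: float, v: float) -> float:
    """Entropy normalized to vanish at the fiducial point (1, 1).

    This mimics the normalization produced by the construction in the proof of
    Theorem 2.5; it differs from s_hom by an additive constant only.
    """
    return s_hom(u, v) - s_hom(1.0, 1.0)


def s_redefined(u: float, v: float) -> float:
    """Shifted entropy S*(U, V) = S(U, V) + S(2, 2) on the unscaled state space."""
    return s_constructed(u, v) + s_constructed(2.0, 2.0)


def s_scaled_redefined(t: float, u: float, v: float) -> float:
    """Entropy on the scaled copy after the extensivity-preserving shift by t * S(2, 2)."""
    return t * s_constructed(u / t, v / t) + t * s_constructed(2.0, 2.0)
```

In this sketch both `s_redefined(u, v)` and `s_scaled_redefined(t, u, v)` reduce to `s_hom(u, v)` for every `t > 0`, up to rounding, which is the statement that all scaled copies end up with the same homogeneous entropy.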



F. Concavity of entropy 

Up to now we have not used, or assumed, any geometric property of a state space $\Gamma$. It is an
important stability property of thermodynamical systems, however, that the entropy function is a
concave function of the state variables, a requirement that was emphasized by Maxwell, Gibbs,
Callen and many others. Concavity also plays an important role in the definition of temperature,
as in Section V.

In order to have this concavity it is first necessary to make the state space on which entropy
is defined into a convex set, and for this purpose the choice of coordinates is important. Here,
we begin the discussion of concavity by discussing this geometric property of the underlying state
space and some of the consequences of the convex combination axiom A7 for the relation $\prec$, to be
given after the following definition.

Definition: By a state space with a convex structure, or simply a convex state space,
we mean a state space $\Gamma$, that is a convex subset of some linear space, e.g., $\mathbb{R}^n$. That is, if $X$ and
$Y$ are any two points in $\Gamma$ and if $0 \leq t \leq 1$, then the point $tX + (1-t)Y$ is a well-defined point in
$\Gamma$. A concave function, $S$, on $\Gamma$ is one satisfying the inequality

$$S(tX + (1-t)Y) \geq tS(X) + (1-t)S(Y).$$ (2.21)

Our basic convex combination axiom for the relation $\prec$ is the following.

A7) Convex combination. Assume $X$ and $Y$ are states in the same convex state space, $\Gamma$. For
$t \in [0,1]$ let $tX$ and $(1-t)Y$ be the corresponding states of their $t$ scaled and $(1-t)$ scaled
copies, respectively. Then the point $(tX, (1-t)Y)$ in the product space $\Gamma^{(t)} \times \Gamma^{(1-t)}$ satisfies

$$(tX, (1-t)Y) \prec tX + (1-t)Y.$$ (2.22)

Note that the right side of (2.22) is in $\Gamma$ and is defined by ordinary convex combination of
points in the convex set $\Gamma$.

The physical meaning of A7 is more or less evident, but it is essential to note that the convex 
structure depends heavily on the choice of coordinates for $\Gamma$. A7 means that if we take a bottle
containing 1/4 moles of nitrogen and one containing 3/4 moles (with possibly different pressures 
and densities), and if we mix them together, then among the states of one mole of nitrogen that 
can be reached adiabatically there is one in which the energy is the sum of the two energies and, 
likewise, the volume is the sum of the two volumes. Again, we emphasize that the choice of energy 
and volume as the (mechanical) variables with which we can make this statement is an important 
assumption. If, for example, temperature and pressure were used instead, the statement would not 
only not hold, it would not even make much sense. 
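
The link between A7 and the concavity inequality (2.21) can be probed numerically. The following sketch assumes, for illustration only, a one-mole entropy of ideal-gas type in the coordinates (U, V), and compares the additive entropy of the pair (tX, (1-t)Y) with the entropy of the convex combination tX + (1-t)Y. The entropy formula and the function names are assumptions, not part of the axiomatic development.

```python
import math
import random


def entropy(u: float, v: float) -> float:
    """Assumed concave model entropy for one mole in energy-volume coordinates."""
    return 1.5 * math.log(u) + math.log(v)


def a7_gap(x, y, t: float) -> float:
    """S(tX + (1-t)Y) - [t S(X) + (1-t) S(Y)].

    The bracket is the additive, extensive entropy of the compound state
    (tX, (1-t)Y); a nonnegative gap is what (2.22) requires of an entropy.
    """
    (u1, v1), (u2, v2) = x, y
    mixed = entropy(t * u1 + (1 - t) * u2, t * v1 + (1 - t) * v2)
    separate = t * entropy(u1, v1) + (1 - t) * entropy(u2, v2)
    return mixed - separate


def min_gap(samples: int = 1000, seed: int = 0) -> float:
    """Smallest gap found over randomly sampled pairs of states and weights t."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(samples):
        x = (rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0))
        y = (rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0))
        worst = min(worst, a7_gap(x, y, rng.random()))
    return worst
```

For the bottle example with t = 1/4, `a7_gap(x, y, 0.25)` compares the two bottles taken separately with the single state whose energy and volume are the sums. In this model the gap is never negative beyond rounding error.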

The physical example above seems not exceptionable for liquids and gases. On the other hand 
it is not entirely clear how to ascribe an operational meaning to a convex combination in the state 
space of a solid, and the physical meaning of axiom A7 is not as obvious in this case. Note, however, 
that although convexity is a global property, it can often be inferred from a local property of the 
boundary. (A connected set with a smooth boundary, for instance, is convex if every point on the
boundary has a neighbourhood whose intersection with the set is convex.) In such cases it suffices
to consider convex combinations of points that are close together and close to the boundary. For 
small deformation of an isotropic solid the six strain coordinates, multiplied by the volume, can 
be taken as work coordinates. Thus, A7 amounts to assuming that a convex combination of these 
coordinates can always be achieved adiabatically. See, e.g., (Callen, 1985). 

If $X \in \Gamma$ we denote by $A_X$ the set $\{Y \in \Gamma : X \prec Y\}$. $A_X$ is called the forward sector of $X$
in $\Gamma$. More generally, if $\Gamma'$ is another system, we call the set

$$\{Y \in \Gamma' : X \prec Y\},$$

the forward sector of $X$ in $\Gamma'$.

Usually this concept is applied to the case in which $\Gamma$ and $\Gamma'$ are identical, but it can also
be useful in cases in which one system is changed into another; an example is the mixing of two
liquids in two containers (in which case $\Gamma$ is a compound system) into a third vessel containing the
mixture (in which case $\Gamma'$ is simple).

The main effect of A7 is that forward sectors are convex sets. 

THEOREM 2.6 (Forward sectors are convex). Let $\Gamma$ and $\Gamma'$ be state spaces of two
systems, with $\Gamma'$ a convex state space. Assume that A1-A5 hold for $\Gamma$ and $\Gamma'$ and, in addition, A7
holds for $\Gamma'$. Then the forward sector of $X$ in $\Gamma'$, defined above, is a convex subset of $\Gamma'$ for each
$X \in \Gamma$.

Proof: Suppose $X \prec Y_1$ and $X \prec Y_2$ and that $0 < t < 1$. We want to show that
$X \prec tY_1 + (1 - t)Y_2$. (The right side defines, by ordinary vector addition, a point in the convex set $\Gamma'$.)
First, $X \prec (tX, (1 - t)X) \in \Gamma^{(t)} \times \Gamma^{(1-t)}$, by axiom A5. Next, $(tX, (1 - t)X) \prec (tY_1, (1 - t)Y_2)$ by the
consistency axiom A3 and the scaling invariance axiom A4. Finally, $(tY_1, (1 - t)Y_2) \prec tY_1 + (1 - t)Y_2$
by the convex combination axiom A7. ■



Figure 3 illustrates this theorem in the case $\Gamma = \Gamma'$.

Insert Figure 3 here 

THEOREM 2.7 (Convexity of $S_\lambda$). Let the sets $S_\lambda \subset \Gamma$ be defined as in (2.12) and assume
the state space $\Gamma$ satisfies the convex combination axiom A7 in addition to A1-A5. Then:

(i) $S_\lambda$ is convex.

(ii) If $X \in S_{\lambda_1}$, $Y \in S_{\lambda_2}$ and $0 \le t \le 1$, then $tX + (1 - t)Y \in S_{t\lambda_1 + (1-t)\lambda_2}$.

Proof. (i) This follows immediately from the scaling, splitting and convex combination axioms
A4, A5 and A7.

(ii) This is proved by splitting, moving the states of the subsystems into forward sectors and
bringing the subsystems together at the end. More precisely, defining $\lambda = t\lambda_1 + (1 - t)\lambda_2$ we have
to show that $((1 - \lambda)X_0, \lambda X_1) \prec tX + (1 - t)Y$. Starting with $((1 - \lambda)X_0, \lambda X_1)$ we split $(1 - \lambda)X_0$
into $(t(1 - \lambda_1)X_0, (1 - t)(1 - \lambda_2)X_0)$ and $\lambda X_1$ into $(t\lambda_1 X_1, (1 - t)\lambda_2 X_1)$. Next we consider the
states $(t(1 - \lambda_1)X_0, t\lambda_1 X_1)$ and $((1 - t)(1 - \lambda_2)X_0, (1 - t)\lambda_2 X_1)$. By scaling invariance A4 and
the splitting property A5 we can pass from the former to $(t(1 - \lambda_1)X, t\lambda_1 X)$ and from the latter
to $((1 - t)(1 - \lambda_2)Y, (1 - t)\lambda_2 Y)$. Now we combine the parts of $(t(1 - \lambda_1)X, t\lambda_1 X)$ to obtain $tX$
and the parts of $((1 - t)(1 - \lambda_2)Y, (1 - t)\lambda_2 Y)$ to obtain $(1 - t)Y$, and finally we use the convex
combination property A7 to reach $tX + (1 - t)Y$. ■



THEOREM 2.8 (Concavity of entropy). Let $\Gamma$ be a convex state space. Assume axiom
A7 in addition to A1-A6, and CH for multiple scaled copies of $\Gamma$. Then the entropy $S_\Gamma$ defined by
(2.14) is a concave function on $\Gamma$. Conversely, if $S_\Gamma$ is concave, then axiom A7 necessarily holds
a-fortiori.

Proof: If $X \in S_{\lambda_1}$, $Y \in S_{\lambda_2}$, then by Theorem 2.7, (ii), $tX + (1 - t)Y \in S_{t\lambda_1 + (1-t)\lambda_2}$, for
$t, \lambda_1, \lambda_2 \in [0, 1]$. By definition, this implies $S_\Gamma(tX + (1 - t)Y) \ge t\lambda_1 + (1 - t)\lambda_2$. Taking the supremum
over all $\lambda_1$ and $\lambda_2$ such that $X \in S_{\lambda_1}$, $Y \in S_{\lambda_2}$, then gives $S_\Gamma(tX + (1 - t)Y) \ge tS_\Gamma(X) + (1 - t)S_\Gamma(Y)$.
The converse is obvious. ■

G. Irreversibility and Caratheodory's principle 

One of the milestones in the history of the second law is Caratheodory's attempt to formulate 
the second law in terms of purely local properties of the equivalence relation $\stackrel{A}{\sim}$. The disadvantage
of the purely local formulation is, as we said earlier, the difficulty of deriving a globally defined
concave entropy function. Additionally, Caratheodory relies on differentiability (differential forms), 
and we would like to avoid this, if possible, because physical systems do have points (e.g., phase 
transitions) in their state spaces where differentiability fails. Nevertheless, Caratheodory's idea 
remains a powerful one and it does play an important role in the story. We shall replace it by 
a seemingly more natural idea, namely the existence of irreversible processes. The existence of 
many such processes lies at the heart of thermodynamics. If they did not exist, it would mean that 
nothing is forbidden, and hence there would be no second law. We now show the relation between 
the two concepts. There will be no mention of differentiability, however. 

Caratheodory's principle has been criticized (see, for example, the remark attributed to Walter 
in Truesdell's paper in (Serrin, 1986, Chapter 5)) on the ground that this principle does not tell
us where to look for a non-adiabatic process that is supposed, by the principle, to exist in every
neighborhood of every state. In Sects. III and V we show that this criticism is too severe because
the principle, when properly interpreted, shows exactly where to look and, in conjunction with the 
other axioms, it leads to the Kelvin-Planck version of the second law. 

THEOREM 2.9 (Caratheodory's principle and irreversible processes). Let $\Gamma$ be a
state space that is a convex subset of $\mathbf{R}^n$ and assume that axioms A1-A7 hold on $\Gamma$. Consider the
following two statements.

(1) Existence of irreversible processes: For every point $X \in \Gamma$ there is a $Y$ such that
$X \prec\prec Y$.

(2) Caratheodory's principle: In every neighborhood of every $X \in \Gamma$ there is a point $Z \in \Gamma$
such that $X \stackrel{A}{\sim} Z$ is false.

Then (1) always implies (2). Indeed, (1) implies the stronger statement that there is a $Z$ such
that $X \prec Z$ is false. On the other hand, if all the forward sectors in $\Gamma$ have non-empty interiors
(i.e., they are not contained in lower dimensional hyperplanes) then (2) implies (1).

Proof: Suppose that for some $X \in \Gamma$ there is a neighborhood, $\mathcal{N}_X$ of $X$ such that $\mathcal{N}_X$ is
contained in $A_X$, the forward sector of $X$. (This is the negation of the statement that in every
neighbourhood of every $X$ there is a $Z$ such that $X \prec Z$ is false.) Let $Y \in A_X$ be arbitrary. By the
convexity of $A_X$ (which is implied by the axioms), $X$ is an interior point of a line segment joining
$Y$ and some point $Z \in \mathcal{N}_X$. By axiom A7, we thus have

$$((1 - \lambda)Z, \lambda Y) \prec X \stackrel{A}{\sim} ((1 - \lambda)X, \lambda X)$$

for some $\lambda \in (0, 1)$. But we also have that $((1 - \lambda)X, \lambda Y) \prec ((1 - \lambda)Z, \lambda Y)$ since $Z \in A_X$. This
implies, by the cancellation law, that $Y \prec X$. Thus we conclude that for some $X$, we have that
$X \prec Y$ implies $X \stackrel{A}{\sim} Y$. This contradicts (1). In particular, we have shown that (1) $\Rightarrow$ (2).

Conversely, assuming that (1) is false, there is a point $X_0$ whose forward sector is given by
$A_{X_0} = \{Y : Y \stackrel{A}{\sim} X_0\}$. Let $X$ be an interior point of $A_{X_0}$, i.e., there is a neighborhood of $X$, $\mathcal{N}_X$,
which is entirely contained in $A_{X_0}$. All points in $\mathcal{N}_X$ are adiabatically equivalent to $X_0$, however,
and hence to $X$, since $X \in \mathcal{N}_X$. Thus, (2) is false. ■

H. Some further results on uniqueness 

As stated in Theorem 2.2, the existence of an entropy function on a state space $\Gamma$ is equivalent
to the axioms A1-A6 and CH for the multiple scaled copies of $\Gamma$. The entropy function is unique,
up to an affine change of scale, and according to formula (2.14) it is even sufficient to know the
relation on the double scaled copies $\Gamma^{(1-\lambda)} \times \Gamma^{(\lambda)}$ in order to compute the entropy. This was the
observation behind the uniqueness Theorem 2.4 which stated that the restriction of the relation $\prec$
to the double scaled copies determines the relation everywhere.

The following very general result shows that it is in fact not necessary to know $\prec$ on all
$\Gamma^{(1-\lambda)} \times \Gamma^{(\lambda)}$ to determine the entropy, provided the relation is such that the range of the entropy
is connected. In this case $\lambda = 1/2$ suffices. By Theorem 2.8 the range of the entropy is necessarily
connected if the convex combination axiom A7 holds.

THEOREM 2.10 (The relation on $\Gamma \times \Gamma$ determines entropy). Let $\Gamma$ be a set and $\prec$ a
relation on $\Gamma \times \Gamma$. Let $S$ be a real valued function on $\Gamma$ satisfying the following conditions:

(i) $S$ characterizes the relation on $\Gamma \times \Gamma$ in the sense that

$$(X, Y) \prec (X', Y') \quad \text{if and only if} \quad S(X) + S(Y) \le S(X') + S(Y').$$

(ii) The range of $S$ is an interval (bounded or unbounded and which could even be a point).

Let $S^*$ be another function on $\Gamma$ satisfying condition (i). Then $S$ and $S^*$ are affinely related,
i.e., there are numbers $a > 0$ and $B$ such that $S^*(X) = aS(X) + B$ for all $X \in \Gamma$. In particular,
$S^*$ must satisfy condition (ii).

Proof: In general, if $F$ and $G$ are any two real valued functions on $\Gamma \times \Gamma$, such that
$F(X, Y) \le F(X', Y')$ if and only if $G(X, Y) \le G(X', Y')$, it is an easy logical exercise to show that there is a
monotone increasing function $K$ (i.e., $x \le y$ implies $K(x) \le K(y)$) defined on the range of $F$, so
that $G = K \circ F$. In our case $F(X, Y) = S(X) + S(Y)$. If the range of $S$ is the interval $L$ then the
range of $F$ is $2L$. Thus $K$, which is defined on $2L$, satisfies

$$K(S(X) + S(Y)) = S^*(X) + S^*(Y) \tag{2.23}$$

for all $X$ and $Y$ in $\Gamma$ because both $S$ and $S^*$ satisfy condition (i). For convenience, define $M$ on $L$
by $M(t) = \tfrac{1}{2}K(2t)$. If we now set $Y = X$ in (2.23) we obtain

$$S^*(X) = M(S(X)), \quad X \in \Gamma \tag{2.24}$$

and (2.23) becomes, in general,

$$M\left(\frac{x + y}{2}\right) = \frac{1}{2}M(x) + \frac{1}{2}M(y) \tag{2.25}$$

for all $x$ and $y$ in $L$. Since $M$ is monotone, it is bounded on all finite subintervals of $L$. Hence
(Hardy, Littlewood, Polya 1934) $M$ is both concave and convex in the usual sense, i.e.,

$$M(tx + (1 - t)y) = tM(x) + (1 - t)M(y)$$

for all $0 \le t \le 1$ and $x, y \in L$. From this it follows that $M(x) = ax + B$ with $a \ge 0$. If $a$ were
zero then $S^*$ would be constant on $\Gamma$ which would imply that $S$ is constant as well. In that case
we could always replace $a$ by 1 and replace $B$ by $B - S(X)$. ■
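
The passage from (2.23) to the midpoint identity for M uses only (2.24) and the definition of M. Spelled out, with x = S(X) and y = S(Y) ranging over L (this restates the step in the proof and adds no assumption; note that (x + y)/2 lies in L because L is an interval, by condition (ii)):

```latex
% x = S(X), y = S(Y) are arbitrary points of L, and M(t) = (1/2) K(2t) on L.
\begin{align*}
K(x + y) &= S^*(X) + S^*(Y) = M(x) + M(y)
  && \text{by (2.23) and (2.24)}, \\
K(x + y) &= K\!\left(2 \cdot \tfrac{x + y}{2}\right) = 2\,M\!\left(\tfrac{x + y}{2}\right)
  && \text{by the definition of } M, \\
\Longrightarrow \quad
M\!\left(\tfrac{x + y}{2}\right) &= \tfrac{1}{2} M(x) + \tfrac{1}{2} M(y)
  && \text{for all } x, y \in L .
\end{align*}
```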

Remark: It should be noted that Theorem 2.10 does not rely on any structural property of $\Gamma$,
which could be any abstract set. In particular, continuity plays no role; indeed it cannot be defined
because no topology on $\Gamma$ is assumed. The only residue of "continuity" is the requirement that the
range of $S$ be an interval.

That condition (ii) is not superfluous for the uniqueness theorem may be seen from the following 
simple counterexample. 

EXAMPLE: Suppose the state space $\Gamma$ consists of 3 points, $X_0$, $X_1$ and $X_2$, and let $S$ and $S^*$
be defined by $S(X_0) = S^*(X_0) = 0$, $S(X_1) = S^*(X_1) = 1$, $S(X_2) = 3$, $S^*(X_2) = 4$. These functions
correspond to the same order relation on $\Gamma \times \Gamma$, but they are not related by an affine transformation.
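
The counterexample is small enough to verify by brute force. The following sketch (with hypothetical helper names, and reading the values as S = 0, 1, 3 and S* = 0, 1, 4 on the three points) compares the order induced on all pairs by the two functions and then tests whether one is an affine image of the other.

```python
from itertools import product

# Values of the two functions on the three-point state space.
S = {"X0": 0, "X1": 1, "X2": 3}
S_star = {"X0": 0, "X1": 1, "X2": 4}

def same_order_on_pairs(f, g):
    """True if f(X) + f(Y) <= f(X') + f(Y') exactly when the same holds for g."""
    pairs = list(product(f, repeat=2))
    for (X, Y), (Xp, Yp) in product(pairs, repeat=2):
        f_le = f[X] + f[Y] <= f[Xp] + f[Yp]
        g_le = g[X] + g[Y] <= g[Xp] + g[Yp]
        if f_le != g_le:
            return False
    return True

def affinely_related(f, g):
    """True if g = a*f + B, with a and B fixed by the values at X0 and X1."""
    a = (g["X1"] - g["X0"]) / (f["X1"] - f["X0"])
    B = g["X0"] - a * f["X0"]
    return all(g[k] == a * f[k] + B for k in f)

print(same_order_on_pairs(S, S_star))   # both functions order the pairs identically
print(affinely_related(S, S_star))      # but no affine map takes S to S*
```

The range {0, 1, 3} of S is not an interval, which is exactly the hypothesis of Theorem 2.10 that fails here.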

The following sharpening of Theorem 2.4 is an immediate corollary of Theorem 2.10 in the
case that the convexity axiom A7 holds, so that the range of the entropy is connected.

THEOREM 2.11 (The relation on $\Gamma \times \Gamma$ determines the relation everywhere). Let
$\prec$ and $\prec^*$ be two relations on the multiple scaled copies of $\Gamma$ satisfying axioms A1-A7, and CH for
$\Gamma^{(1-\lambda)} \times \Gamma^{(\lambda)}$ for each fixed $\lambda \in [0, 1]$. If $\prec$ and $\prec^*$ coincide on $\Gamma \times \Gamma$, i.e.,

$$(X, Y) \prec (X', Y') \quad \text{if and only if} \quad (X, Y) \prec^* (X', Y')$$

for $X, X', Y, Y' \in \Gamma$, then $\prec$ and $\prec^*$ coincide on all multiple scaled copies of $\Gamma$.

As a last variation on the theme of this subsection let us note that uniqueness of entropy does
not even require knowledge of the order relation $\prec$ on all of $\Gamma \times \Gamma$. The knowledge of $\prec$ on a
relatively thin "diagonal" set will suffice, as Theorem 2.12 shows.

THEOREM 2.12 (Diagonal sets determine entropy). Let $\prec$ be an order relation on
$\Gamma \times \Gamma$ and let $S$ be a function on $\Gamma$ satisfying conditions (i) and (ii) of Theorem 2.10. Let $\mathcal{D}$ be a
subset of $\Gamma \times \Gamma$ with the following properties:

(i) $(X, X) \in \mathcal{D}$ for every $X \in \Gamma$.

(ii) The set $D = \{(S(X), S(Y)) \in \mathbf{R}^2 : (X, Y) \in \mathcal{D}\}$ contains an open subset of $\mathbf{R}^2$ (which
necessarily contains the set $\{(x, x) : x \in \mathrm{Range}\, S\}$).

Suppose now that $\prec^*$ is another order relation on $\Gamma \times \Gamma$ and that $S^*$ is a function on $\Gamma$
satisfying condition (i) of Theorem 2.10 with respect to $\prec^*$ on $\Gamma \times \Gamma$. Suppose further, that $\prec$ and
$\prec^*$ agree on $\mathcal{D}$, i.e.,

$$(X, Y) \prec (X', Y') \quad \text{if and only if} \quad (X, Y) \prec^* (X', Y')$$

whenever $(X, Y)$ and $(X', Y')$ are both in $\mathcal{D}$. Then $\prec$ and $\prec^*$ agree on all of $\Gamma \times \Gamma$ and hence, by
Theorem 2.10, $S$ and $S^*$ are related by an affine transformation.

Proof: By considering points $(X, X) \in \mathcal{D}$, the consistency of $S$ and $S^*$ implies that
$S^*(X) = M(S(X))$ for all $X \in \Gamma$, where $M$ is some monotone increasing function on $L \subset \mathbf{R}$. Again, as in
the proof of Theorem 2.10,

$$M\left(\frac{S(X) + S(Y)}{2}\right) = \frac{1}{2}M(S(X)) + \frac{1}{2}M(S(Y)) \tag{2.26}$$

for all $(X, Y) \in \mathcal{D}$. (Note: In deriving Eq. (2.25) we did not use the fact that $\Gamma \times \Gamma$ was the
Cartesian product of two spaces; the only thing that was used was the fact that $S(X) + S(Y)$
characterized the level sets of $\Gamma \times \Gamma$. Thus, the same argument holds with $\Gamma \times \Gamma$ replaced by $\mathcal{D}$.)

Now fix $X \in \Gamma$ and let $x = S(X)$. Since $D$ contains an open set that contains the point
$(x, x) \in \mathbf{R}^2$, there is an open square

$$Q = (x - \epsilon, x + \epsilon) \times (x - \epsilon, x + \epsilon)$$

in $D$. Eqn. (2.26) holds on $Q$ and so we conclude, as in the proof of Theorem 2.10, that, for
$y \in (x - \epsilon, x + \epsilon)$, $M(y) = ay + B$ for some $a$, $B$, which could depend on $Q$, a-priori.

The diagonal $\{(x, x) : x \in L\}$ is covered by these open squares and, by the Heine-Borel theorem,
any closed, finite section of the diagonal can be covered by finitely many squares $Q_1, Q_2, \ldots, Q_n$,
which we order according to their "diagonal point" $(x_i, x_i)$. They are not disjoint and, in fact, we
can assume that $T_i := Q_i \cap Q_{i+1}$ is never empty. In each interval $(x_i - \epsilon, x_i + \epsilon)$, $M(x) = a_i x + B_i$
but agreement in the overlap region $T_i$ requires that $a_i$ and $B_i$ be independent of $i$. Thus,
$S^*(X) = aS(X) + B$ for all $X \in \Gamma$, as claimed. ■

III. SIMPLE SYSTEMS



Simple systems are the building blocks of thermodynamics. In general, the equilibrium state of 
a (simple or complex) system is described by certain coordinates called work coordinates and certain 
coordinates called energy coordinates. Physically, the work coordinates are the parameters one can 
adjust by mechanical (or electric or magnetic) actions. We denote work coordinates collectively by 
V because the volume is a typical one. A simple system is characterized by the fact that it has 
exactly one energy coordinate, denoted by U. 

The meaning of these words will be made precise; as always there is a physical interpretation 
and a mathematical one. The remark we made in the beginning of Section II is especially apt here; 
the mathematical axioms and theorems should be regarded as independent of the numerous asides 
and physical discussions that surround them and which are not intrinsic to the logical structure, 
even though they are very important for the physical interpretation. The mathematical description 
of simple systems will require three new assumptions, S1-S3. In our axiomatics simple systems 
with their energy and work coordinates are basic (primitive) concepts that are related to the other 
concepts by the axioms. The statement that they are the building blocks of thermodynamics has 
in our approach the precise meaning that from this section on, all systems under consideration are 
assumed to be scaled products of simple systems. 

From the physical point of view, a simple system is a fixed quantity of matter with a fixed
amount of each element of the periodic table. The content of a simple system can be quite
complicated. It can consist of a mixture of several chemical species, even reactive ones, in which case
the amount of the different components might change as the external parameters (e.g., the volume)
change. A simple system need not be spatially homogeneous. For example, a system consisting of
two vessels, each with a piston, but joined by a heat conducting thread, is simple; it has two work 
coordinates (the volumes of the two vessels), but only one energy coordinate since the two vessels 
are always in thermal equilibrium when the total system is in equilibrium. This example is meant 
to be informal and there is no need to define the words 'piston', 'thread' and 'heat conducting'. It 
is placed here as an attempt at clarification and also to emphasize that our definition of 'simple 
system' is not necessarily the same as that used by other authors. 

An example of a compound, i.e., non-simple system is provided by two simple systems placed 
side by side and not interacting with each other. In this case the state space is just the Cartesian 
product of the individual state spaces. In particular, two energies are needed to describe the state 
of the system, one for each subsystem. 

Some examples of simple systems are: 

(a) One mole of water in a container with a piston (one work coordinate).

(b) A half mole of oxygen in a container with a piston and in a magnetic field (two work coordinates, 
the volume and the magnetization). 

(c) Systems (a) and (b) joined by a copper thread (three work coordinates). 

(d) A mixture consisting of 7 moles of hydrogen and one mole of oxygen (one work coordinate). 
Such a mixture is capable of explosively reacting to form water, of course, but for certain 
purposes (e.g., in chemistry, material science and in astrophysics) we can regard a non-reacting, 
metastable mixture as capable of being in an equilibrium state, as long as one is careful not 
to bump the container with one's elbow. 

To a certain extent, the question of which physical states are to be regarded as equilibrium 
states is a matter of practical convention. The introduction of a small piece of platinum in (d) will
soon show us that this system is not truly in equilibrium, although it can be considered to be in
equilibrium for practical purposes if no catalyst is present. 

A few more remarks will be made in the following about the physics of simple systems,
especially the meaning of the distinguished energy coordinate. In the real world, it is up to the
experimenter to decide when a system is in equilibrium and when it is simple. If the system
satisfies the mathematical assumptions of a simple system (which we present next) then our analysis
applies and the second law holds for it. Otherwise, we cannot be sure.

Our main goal in this section is to show that the forward sectors in the state space $\Gamma$ of a
simple system form a nested family of closed sets, i.e., two sectors are either identical or one is
contained in the interior of the other (Theorem 3.7). Fig. 5 illustrates this true state of affairs, and
also what could go wrong a priori in the arrangement of the forward sectors, but is excluded by our
additional axioms S1-S3. Nestedness of forward sectors means that the comparison principle holds
within the state space $\Gamma$. The comparison principle for multiple scaled copies of $\Gamma$, which is needed
for the definition of an entropy function on $\Gamma$, will be derived in the next section from additional
assumptions about thermal equilibrium.

A. Coordinates for simple systems 

An (equilibrium) state of a simple system is parametrized uniquely (for thermodynamic
purposes) by a point in $\mathbf{R}^{n+1}$, for some $n > 0$ depending on the system (but not on the state).

A point in $\mathbf{R}^{n+1}$ is written as $X = (U, V)$ with $U$ a distinguished coordinate called the internal
energy and with $V = (V_1, \ldots, V_n) \in \mathbf{R}^n$. The coordinates $V_i$ are called the work coordinates.

We could, if we wished, consider the case $n = 0$, in which case we would have a system
whose states are parametrized by the energy alone. Such a system is called a thermometer or a
degenerate simple system. These systems must be (and will be in Section IV) treated separately
because they will fail to satisfy the transversality axiom T4, introduced in Section IV. From the
point of view of the convexity analysis in the present section, degenerate simple systems can be
regarded as trivial.

The energy is special, both mathematically and physically. The fact that it can be defined as 
a physical coordinate really goes back to the first law of thermodynamics, which says that the 
amount of work done by the outside world in going adiabatically from one state of the system to 
another is independent of the manner in which this transition is carried out. This amount of work 
is the amount by which a weight was raised or lowered in the physical definition given earlier of an 
adiabatic process. (At the risk of being tiresomely repetitive, we remind the reader that 'adiabatic'
means neither 'slow' nor 'isolated' nor any restriction other than the requirement that the external 
machinery returns to its original state while a weight may have risen or fallen.) Repeatedly, authors 
have discussed the question of exactly what has to be assumed in order that this fact lead to a 
unique (up to an additive constant) energy coordinate for all states in a system with the property 
that the difference in the value of the parameter at two points equals the work done by the outside 
world in going adiabatically from one point to the other. See e.g., (Buchdahl, 1966), (Rastall, 
1970), and (Boyling, 1972). These discussions are interesting, but for us the question lies outside 
the scope of our inquiry, namely the second law. We simply take it for granted that the state space 
of a simple system can be parametrized by a subset of some $\mathbf{R}^{n+1}$ and that there is one special
coordinate, which we call 'energy' and which we label by U. Whether or not this parametrization 
is unique is of no particular importance for us. The way in which U is special will become clear 
presently when we discuss the tangent planes that define the pressure function.

Mathematically, we just have coordinates. The question of which physical variables to attach to
them is important in making the transition from physics to mathematics and back again. Certainly, 
the coordinates have to be chosen so that we are capable of specifying states in a one-to-one manner. 
Thus, U = energy and V = volume are better coordinates for water than, e.g., H = U + PV and 
P, because U and V are capable of uniquely specifying the division of a multi-phase system into 
phases, while H and P do not have this property. For example, the triple point of water corresponds 
to a triangle in the U, V plane (see Fig. 8), but in the H, P plane the triple point corresponds to 
a line, in which case one cannot know the amount of the three phases merely by specifying a point 
on the line. The fundamental nature of energy and volume as coordinates was well understood by 
Gibbs and others, but seems to have gotten lost in many textbooks. Not only do these coordinates 
have the property of uniquely specifying a state but they also have the advantage of being directly 
tied to the fundamental classical mechanical variables, energy and length. We do not mean to 
imply that energy and volume always suffice. Additional work coordinates, such as magnetization, 
components of the strain tensor, etc., might be needed. 

Associated with a simple system is its state space, which is a non-empty convex and open
subset $\Gamma \subset \mathbf{R}^{n+1}$. This $\Gamma$ constitutes all values of the coordinates that the system can reach. $\Gamma$ is
open because points on the boundary of $\Gamma$ are regarded as not reachable physically in a finite time,
but there could be exceptions.

The reason that $\Gamma$ is convex was discussed at length in Section II.F. We assume axioms A1-A7.
In particular, a state space $\Gamma$, scaled by $t > 0$, is the convex set

$$\Gamma^{(t)} = t\Gamma := \{tX : X \in \Gamma\}. \tag{3.1}$$

Thus, what was formerly the abstract symbol $tX$ is now concretely realized as the point
$(tU, tV) \in \mathbf{R}^{n+1}$ when $X = (U, V) \in \mathbf{R}^{n+1}$.

Remark. Even if $\Gamma^{(t)}$ happens to coincide with $\Gamma$ as a subset of $\mathbf{R}^{n+1}$ (as it does, e.g., if $\Gamma$
is the orthant $\Gamma = \mathbf{R}^{n+1}_+$) it is important to keep in mind that the mole numbers that specify the
material content of the states in $\Gamma^{(t)}$ are $t$-times the mole numbers for the states in $\Gamma$. Hence the
state spaces must be regarded as different. The photon gas, mentioned in Sect. II.B, is an exception:
Particle number is not conserved, and 'material content' is not an independent variable. Hence the
state spaces $\Gamma^{(t)}$ are all physically identical in this case, i.e., no physical measurement can tell them
apart. Nevertheless it is a convenient fiction to regard them as mathematically distinguishable;
in the end, of course, they must all have the same properties, e.g., entropy, as a function of the
coordinates, up to an additive constant, which can always be adjusted to be zero, as discussed
after Theorem 2.5.

Usually, a forward sector, $A_X$, with $X = (U^0, V^0)$, contains the 'half-lines' $\{(U, V^0) : U \ge U^0\}$
and $\{(U^0, V) : V_i \ge V_i^0, i = 1, \ldots, n\}$ but, theoretically, at least, it might not do so. In other
words, $\Gamma$ might be a bounded subset of $\mathbf{R}^{n+1}$. This happens, e.g., for a quantum spin system. Such a
system is a theoretical abstraction from the real world because real systems always contain modes, 
other than spin modes, capable of having arbitrarily high energy. We can include such systems 
with bounded state spaces in our theory, however, but then we have to be a bit careful about our 
definitions of state spaces and the forward sectors that lie in them. This partially accounts for what 
might appear to be the complicated nature of the theorems in this section. 

Scaling and convexity might at first sight appear to be requirements that exclude from the 
outset the treatment of 'surface effects' in our framework. In fact, a system like a drop of a 
liquid, where volume and surface effects are coupled, is not a simple system. But as we shall now
argue, the state space of such a system can be regarded as a subset of the convex state space
of a simple system that contains all the relevant thermodynamic information. The independent
work coordinates of this system are the volume $V$ and the surface area $A$. Such a system could,
at least in principle, be realized by putting the liquid in a rectangular pan made out of such a
material that the adhesive energy between the walls of the pan and the liquid exactly matches the
cohesive energy of the liquid. I.e., there is no surface energy associated with the boundary between
liquid and walls, only between liquid and air. (Alternatively, one can think of an 'ocean' of liquid
and separate a fixed amount of it (a 'system') from the rest by a purely fictitious boundary.) By
making the pan (or the fictitious boundary) longer at fixed breadth and depth and, by pouring in
the necessary amount of liquid, one can scale the system as one pleases. Convex combination of
states also has an obvious operational meaning. By varying breadth and depth at fixed length the
surface area $A$ can be varied independently of the volume $V$. Lack of scaling and convexity enter
only when we restrict ourselves to non-convex submanifolds of the state space, defined by subsidiary
conditions like $A = (4\pi)^{1/3} 3^{2/3} V^{2/3}$ that are appropriate for a drop of liquid. But such coupling of
work coordinates is not special to surface effects; by suitable devices one can do similar things for 
any system with more than one work coordinate. The important point is that the thermodynamic 
properties of the constrained system are derivable from those of the unconstrained one, for which 
our axioms hold. 
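
The failure of scaling and convexity on the drop constraint can be seen with a few numbers. The sketch below considers only the two work coordinates (V, A) and uses the constraint for a spherical drop, A = (4 pi)^(1/3) 3^(2/3) V^(2/3); the function names and the sample volumes are hypothetical choices for illustration.

```python
import math

def drop_area(V):
    """Surface area of a spherical drop of volume V."""
    return (4.0 * math.pi) ** (1.0 / 3.0) * 3.0 ** (2.0 / 3.0) * V ** (2.0 / 3.0)

def on_drop_manifold(V, A, rel_tol=1e-9):
    """True if the point (V, A) satisfies the drop constraint."""
    return math.isclose(A, drop_area(V), rel_tol=rel_tol)

# Two drops, each on the constraint curve.
V1, V2 = 1.0, 8.0
A1, A2 = drop_area(V1), drop_area(V2)

# Scaling by t multiplies both work coordinates by t ...
t = 0.5
scaled = (t * V1, t * A1)
# ... and a convex combination averages both coordinates.
midpoint = (0.5 * (V1 + V2), 0.5 * (A1 + A2))

print(on_drop_manifold(*scaled))     # False: the scaled point leaves the curve
print(on_drop_manifold(*midpoint))   # False: so does the convex combination
```

Both points lie in the unconstrained (V, A) state space, where volume and area are independent, which is the space to which the axioms are applied.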

It should be remarked that the experimental realization of the simple system with volume and 
surface as independent work coordinates described above might not be easy in practice. In fact, 
the usual procedure would be to compare measurements on the liquid in bulk and on drops of liquid,
and then, by inverting the data, infer the properties of the system where volume and surface are 
independent variables. The claim that scaling and convexity are compatible with the inclusion of 
surface effects amounts to saying that these properties hold after such a 'disentanglement' of the 
coordinates. 

B. Assumptions about simple systems 

As was already stated, we assume the general axioms A1-A7 of Section II. Since the state 
space $\Gamma$ of a simple system has a convex structure, we recall from Theorem 2.6 that the forward
sector of a point $X \in \Gamma$, namely $A_X = \{Y \in \Gamma : X \prec Y\}$ is a convex subset of $\Gamma \subset \mathbf{R}^{n+1}$. We now
introduce three new axioms. It is also to be noted that the comparison hypothesis, CH, is not used
here; indeed, our chief goal in this section and the next is to derive CH from the other axioms.

The new axioms are: 

S1) Irreversibility. For each $X \in \Gamma$ there is a point $Y \in \Gamma$ such that $X \prec\prec Y$. In other
words, each forward sector, $A_X$, consists of more than merely points that, like $X$ itself, are
adiabatically equivalent to $X$.

We remark that axiom S1 is implied by the thermal transversality axiom T4 in Section IV.
This fact deserves to be noted in any count of the total number of axioms in our formulation of
the second law, and it explains why we gave the number of our axioms as 15 in Section I. Axiom
S1 is listed here as a separate axiom because it is basic to the analysis of simple systems and is
conceptually independent of the notion of thermal equilibrium presented in Section IV.

By Theorem 2.9 Caratheodory's principle holds. This principle implies that

$$X \in \partial A_X, \tag{3.2}$$

where $\partial A_X$ denotes the boundary of $A_X$. By 'boundary' we mean, of course, the relative boundary,
i.e., the part of the usual boundary of $A_X$, (considered as a subset of $\mathbf{R}^{n+1}$) that lies in $\Gamma$.

Since $X$ lies on the boundary of the convex set $A_X$ we can draw at least one support plane to
$A_X$ that passes through $X$, i.e., a plane with the property that $A_X$ lies entirely on one side of the
plane. Convexity alone does not imply that this plane is unique, or that this plane intersects the
energy axis of $\Gamma$. The next axiom deals with these matters.

S2) Lipschitz tangent planes. For each $X \in \Gamma$ the forward sector $A_X$ has a unique support
plane at $X$ (i.e., $A_X$ has a tangent plane at $X$), denoted by $\Pi_X$. The tangent plane $\Pi_X$ is
assumed to have a finite slope with respect to the work coordinates and the slope is moreover
assumed to be a locally Lipschitz continuous function of $X$.

We emphasize that this tangent plane to $A_X$ is initially assumed to exist only at $X$ itself. In
principle, $\partial A_X$ could have 'cusps' at points other than $X$, but Theorem 3.5 will state that this does
not occur.

The precise meaning of the statements in axiom S2 is the following: The tangent plane at
$X = (U^0, V^0)$ is, like any plane in $\mathbf{R}^{n+1}$, defined by a linear equation. The finiteness of the slope
with respect to the work coordinates means that this equation can be written as

$$U - U^0 + \sum_{i=1}^{n} P_i(X)(V_i - V_i^0) = 0, \tag{3.3}$$

in which the $X$ dependent numbers $P_i(X)$ are the parameters that define the slope of the plane
passing through $X$. (The slope is thus in general a vector.) The assumption that $P_i(X)$ is finite
means that the plane is never 'vertical', i.e., it never contains the line $\{(U, V^0) : U \in \mathbf{R}\}$.

The assumption that $\Pi_X$ is the unique supporting hyperplane of $A_X$ at $X$ means that the
linear expression, with coefficients $g_i$,

$$U - U^0 + \sum_{i=1}^{n} g_i (V_i - V_i^0) \tag{3.4}$$

has one sign for all $(U, V) \in A_X$ (i.e., it is $\ge 0$ or $\le 0$ for all points in $A_X$) if and only if
$g_i = P_i(X)$ for all $i = 1, \ldots, n$. The assumption that the slope of the tangent plane is locally
Lipschitz continuous means that each $P_i$ is a locally Lipschitz continuous function on $\Gamma$. This, in
turn, means that for any closed ball $B \subset \Gamma$ with finite radius there is a constant $c = c(B)$ such
that for all $X$ and $Y \in B$

$$|P_i(X) - P_i(Y)| \le c\,|X - Y|_{\mathbf{R}^{n+1}}. \tag{3.5}$$

The function $X \mapsto P(X) = (P_1(X), \ldots, P_n(X))$ from $\Gamma$ to $\mathbf{R}^n$ is called the pressure. Note:
We do not need to assume that $P_i \ge 0$.

Physical motivation: The uniqueness of the support plane comes from the following physical
consideration. We interpret the pressure as realized by a force on a spring that is so adjusted that
the system is in equilibrium at some point $(U^0, V^0)$. By turning the screw on the spring we can
change the volume infinitesimally to $V^0 + \delta V$, all the while remaining in equilibrium. In so doing
we change $U^0$ to $U^0 + \delta U$. The physical idea is that a slow reversal of the screw can take the system
to $(U^0 - \delta U, V^0 - \delta V)$, infinitesimally. The energy change is the same, apart from a sign, in both
directions.

The Lipschitz continuity assumption is weaker than, and is implied by, the assumption that $P_i$
is continuously differentiable. By Rademacher's theorem, however, a locally Lipschitz continuous
function is differentiable almost everywhere, but the relatively rare points of discontinuity of a
derivative are particularly interesting.

The fact that we do not require the pressure to be a differentiable function of X is important 
for real physics because phase transitions occur in the real world, and the pressure need not be 
differentiable at such transition points. Some kind of continuity seems to be needed, however, 
and local Lipschitz continuity does accord with physical reality, as far as we know. It plays an 
important role here because it guarantees the uniqueness of the solution of the differential equation 
given in Theorem 3.5 below. It is also important in Section V when we prove the differentiability 
of the entropy, and hence the uniqueness of temperature. This is really the only reason we invoke 
continuity of the pressure and this assumption could, in principle, be dropped if we could be sure 
about the uniqueness and differentiability just mentioned. There are, in fact, statistical mechanical
models with special forces that display discontinuous pressures (see, e.g., (Fisher and Milton, 1983))
and temperatures (which then makes temperature into an 'interval-valued' function, as we explain
in Section V) (see, e.g., (Thirring, 1983)). These models are not claimed to be realistic; indeed,
there are some theorems in statistical mechanics that prove the Lipschitz continuity of the pressure
under some assumptions on the interaction potentials, e.g., (Dobrushin and Minlos, 1967). See
(Griffiths, 1972).

There is another crucial fact about the pressure functions that will finally be proved in Section
V, Theorem 5.4. The surfaces $\partial A_X$ will turn out to be the surfaces of constant entropy, $S(U,V)$,
and evidently, from the definition of the tangent plane (3.3), the functions $P_i(X)$ are truly the
pressures in the sense that

$$P_i(X) = -\frac{\partial U}{\partial V_i}(X) \tag{3.6}$$

along the (constant entropy) surface $\partial A_X$. However, one would also like to know the following
two facts, which are at the basis of Maxwell's relations, and which are the fundamental defining
relations in many treatments.

$$\frac{1}{T(X)} = \frac{\partial S}{\partial U}(X) \tag{3.7}$$

and

$$\frac{P_i(X)}{T(X)} = \frac{\partial S}{\partial V_i}(X), \tag{3.8}$$

where $T(X)$ is the temperature in the state $X$. Equation (3.7) constitutes, for us, the definition of
temperature, but we must first prove that $S(U, V)$ is sufficiently smooth in order to make sense of
(3.7). Basically, this is what Section V is all about.
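
As an illustration only, the relations between pressure, temperature and entropy discussed here, namely $P_i = -\partial U/\partial V_i$ along an adiabat, $1/T = \partial S/\partial U$ and $P_i/T = \partial S/\partial V_i$, can be checked by hand on a familiar model. The monatomic ideal gas entropy used below is an assumption brought in for the example (with particle number $N$ and Boltzmann constant $k$); it is not derived from the axioms at this point of the text.

```latex
% Assumed model (not from the text): monatomic ideal gas, one work coordinate V.
S(U,V) = Nk\left[\ln V + \tfrac{3}{2}\ln U\right] + \text{const}.
% Adiabats (level sets of S): U V^{2/3} = C, hence along an adiabat
P = -\frac{\partial U}{\partial V}\bigg|_{S} = \frac{2}{3}\,C\,V^{-5/3} = \frac{2}{3}\,\frac{U}{V}.
% Temperature from the U-derivative of S:
\frac{1}{T} = \frac{\partial S}{\partial U} = \frac{3Nk}{2U}
\quad\Longrightarrow\quad U = \tfrac{3}{2}NkT .
% Consistency of the V-derivative of S with P/T:
\frac{\partial S}{\partial V} = \frac{Nk}{V}, \qquad
\frac{P}{T} = \frac{2U}{3V}\cdot\frac{3Nk}{2U} = \frac{Nk}{V}.
```

In this model the three relations are mutually consistent, and the pressure $P = \tfrac{2}{3}U/V$ is smooth, hence locally Lipschitz, on $U > 0$, $V > 0$.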

In Theorems 3.1 and 3.2 we shall show that $A_X$ is closed and has a non-empty interior,
Interior($A_X$). Physically, the points in Interior($A_X$) represent the states that can be reached from
$X$, by some adiabatic means, in a finite time. (Of course, the re-establishment of equilibrium
usually requires an infinite time but, practically speaking, a finite time suffices.) On the other
hand, the points in $\partial A_X$ require a truly infinite time to reach from $X$. In the usual parlance
they are reached from $X$ only by 'quasi-static reversible processes'. However, these boundary
points can be reached in a finite time with the aid of a tiny bit of cold matter, according to the
stability assumption. If we wish to be pedantically 'physical' we should exclude $\partial A_X$ from $A_X$.
This amounts to replacing $\prec$ by $\prec\prec$, and we would still be able to carry out our theory, with the
help of the stability assumption and some unilluminating epsilons and deltas. Thus, the seemingly
innocuous, but important stability axiom permits us to regard certain infinitely slow processes as
physically valid processes.

Our third axiom about simple systems is technical but important. 

S3) Connectedness of the boundary. We assume that $\partial A_X$ is arcwise connected.

Without this assumption counterexamples to the comparison hypothesis, CH, can be
constructed, even ones satisfying all the other axioms.

Physical motivation: If $Y \in \partial A_X$, we think of $Y$ as physically and adiabatically reachable from
$X$ by a continuous curve in $\partial A_X$ whose endpoints are $X$ and $Y$. (It is not possible to go from $X$
to $Y$ by a curve that traverses the interior of $A_X$ because such a process could not be adiabatic.)
Given this conventional interpretation, it follows trivially that $Y, Z \in \partial A_X$ implies the existence of
a continuous curve in $\partial A_X$ from $Y$ to $Z$. Therefore $\partial A_X$ must be a connected set.

We call the family of relatively closed sets $\{\partial A_X\}_{X \in \Gamma}$ the adiabats of our system. As we
shall see later in Theorem 3.6, $Y \in \partial A_X$ implies that $X \in \partial A_Y$. Thus, all the points on any given
adiabat are equivalent and it is immaterial which one is chosen to specify the adiabat.

C. The geometry of forward sectors 

In this subsection all points are in the state space of the same fixed, simple system $\Gamma$, if not
otherwise stated. $\Gamma$ is, of course, regarded here as a subset of some $\mathbb{R}^{n+1}$.

We begin with an interesting geometric fact that complements convexity, in some sense.
Suppose that $X, Y, Z$ are three collinear points, with $Y$ in the middle, i.e., $Y = tX + (1-t)Z$ with
$0 < t < 1$. The convexity axiom A7 tells us that

$$X \prec Z \quad \text{implies that} \quad X \prec Y \tag{3.9}$$

because $X \overset{A}{\sim} ((1-t)X, tX) \prec ((1-t)Z, tX) \prec Y$. The next lemma is geometrically related to this,
but its origins are different. We shall use this lemma in the proof of Theorems 3.3 and 3.7 below.

LEMMA 3.1 (Collinear points). Let $Y = tX + (1-t)Z$ with $0 < t < 1$ as above and
suppose that $Y \prec Z$. Then $X \prec Y$ (and hence $X \prec Z$).

Remark: Equation (3.9) and Lemma 3.1 rely only on the convexity of $\Gamma$ and on axioms A1-A7.
The same properties hold for compounds of simple systems (note that the Cartesian product of
two convex sets is convex) and hence (3.9) and Lemma 3.1 hold for compounds as well.

Proof: By A7, A5, our hypothesis, and A3

$$(tX, (1-t)Z) \prec Y \overset{A}{\sim} (tY, (1-t)Y) \prec (tY, (1-t)Z).$$

By transitivity, A2, and the cancellation law, Theorem 2.1, $tX \prec tY$. By scaling, A4, $X \prec Y$. ■

Our first theorem in this section, about closedness, is crucial because it lies behind many
of the more complex theorems. Once again, the seemingly innocuous stability axiom A6 plays a
central role. As we said in Section II, this axiom amounts to some kind of continuity in a setting
in which, at first, there is not even any topology on the state spaces. Now that we are in $\mathbb{R}^{n+1}$,
the topology is evident and stability reveals its true character in the statement of closedness in the
usual topological sense. The following proof has some of the spirit of the proof of Lemma 3.1.

THEOREM 3.1 (Forward sectors are closed). The forward sector, $A_X$, of each point
$X \in \Gamma$ is a relatively closed subset of $\Gamma$, i.e., Closure($A_X$) $\cap\, \Gamma = A_X$.

Proof: The proof uses only axioms A1-A7, in particular stability, A6, and convexity, A7, but
not S1-S3. What we have to prove is that if $Y \in \Gamma$ is on the boundary of $A_X$ then $Y$ is in $A_X$. For
this purpose we can assume that the set $A_X$ has full dimension, i.e., the interior of $A_X$ is not empty.
If, on the contrary, $A_X$ lay in some lower dimensional hyperplane then the following proof would
work, without any changes, simply by replacing $\Gamma$ by the intersection of $\Gamma$ with this hyperplane.

Let $W$ be any point in the interior of $A_X$. Since $A_X$ is convex, and $Y$ is on the boundary of
$A_X$, the half-open line segment joining $W$ to $Y$ (call it $[W,Y)$, bearing in mind that $Y \notin [W,Y)$)
lies in $A_X$. The prolongation of this line beyond $Y$ lies in the complement of $A_X$ and has at least
one point (call it $Z$) in $\Gamma$. (This follows from the fact that $\Gamma$ is open and $Y \in \Gamma$.) For all sufficiently
large integers $n$ the point $Y_n$ defined by

$$\frac{n}{n+1}\,Y_n + \frac{1}{n+1}\,Z = Y \tag{3.10}$$

belongs to $[W,Y)$. We claim that $(X, \tfrac{1}{n}Z) \prec (Y, \tfrac{1}{n}Y)$. If this is so then we are done because, by
the stability axiom, $X \prec Y$.

To prove the last claim, first note that $(X, \tfrac{1}{n}Z) \prec (Y_n, \tfrac{1}{n}Z)$ because $X \prec Y_n$ and by axiom A3.
By scaling, A4, the convex combination axiom A7, and (3.10)

$$\left(Y_n, \tfrac{1}{n}Z\right) = \frac{n+1}{n}\left(\frac{n}{n+1}\,Y_n,\ \frac{1}{n+1}\,Z\right) \prec \frac{n+1}{n}\,Y. \tag{3.11}$$

But this last is adiabatically equivalent to $(Y, \tfrac{1}{n}Y)$ by the splitting axiom, A5. Hence $(X, \tfrac{1}{n}Z) \prec (Y, \tfrac{1}{n}Y)$. ■
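
As a small consistency check (our own rearrangement, not a step stated in the proof), the defining relation (3.10) can be solved for $Y_n$, which makes it visible why $Y_n$ lies on the segment $[W,Y)$ for large $n$ and why the scaling identity in (3.11) holds. The sketch assumes only (3.10) and ordinary vector arithmetic in $\mathbb{R}^{n+1}$.

```latex
% Solve (3.10) for Y_n:
\frac{n}{n+1}\,Y_n + \frac{1}{n+1}\,Z = Y
\;\Longrightarrow\;
Y_n = \frac{(n+1)\,Y - Z}{n} = Y + \frac{1}{n}\,(Y - Z).
% Hence Y_n lies on the line through Z and Y, on the side of Y opposite to Z,
% at distance |Y - Z|/n from Y, and Y_n -> Y as n -> infinity.
% Scaling identity used in (3.11): multiplying each component by (n+1)/n,
\frac{n+1}{n}\left(\frac{n}{n+1}\,Y_n,\ \frac{1}{n+1}\,Z\right)
 = \left(Y_n,\ \frac{1}{n}\,Z\right).
```

Since $Z$ lies on the prolongation of the segment beyond $Y$, the direction $Y - Z$ points from $Y$ back toward $W$, so $Y_n \in [W,Y)$ as soon as $|Y - Z|/n \leq |Y - W|$.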

The following theorem uses Theorem 3.1 in an essential way. 

THEOREM 3.2 (Forward sectors have interiors). For all $X$, the forward sector $A_X$ has
a non-empty interior.

Proof: The proof uses the transitivity axiom, A2, convexity, A7, the existence of irreversible
processes, S1, and the tangent plane axiom S2, but neither local Lipschitz continuity of the pressure
nor the connectedness of the boundary, S3, is required for our proof here.

We start by remarking that a convex set in $\mathbb{R}^{n+1}$ either has a non-empty interior, or it is
contained in a hyperplane. We therefore assume that $A_X$ is contained in a hyperplane and show
that this contradicts the axioms. [An illustrative picture to keep in mind here is that $A_X$ is a
closed, (two-dimensional) disc in $\mathbb{R}^3$ and $X$ is some point inside this disc and not on its perimeter.
This disc is a closed subset of $\mathbb{R}^3$ and $X$ is on its boundary (when the disc is viewed as a subset of
$\mathbb{R}^3$). The hyperplane is the plane in $\mathbb{R}^3$ that contains the disc.]

Any hyperplane containing $A_X$ is a support plane to $A_X$ at $X$, and by axiom S2 the support
plane is unique, so $A_X \subset \Pi_X$. If $Y \in A_X$, then $A_Y \subset A_X \subset \Pi_X$ by transitivity, A2. By the
irreversibility axiom S1, there exists a $Y \in A_X$ such that $A_Y \neq A_X$, which implies that the convex
set $A_Y \subset \Pi_X$, regarded as a subset of $\Pi_X$, has a boundary point in $\Pi_X$. If $Z \in \Pi_X$ is such a
boundary point of $A_Y$, then $Z \in A_Y$ because $A_Y$ is closed. By transitivity, $A_Z \subset A_Y \subset \Pi_X$, and
$A_Z \neq \Pi_X$ because $A_Y \neq A_X$.

Now $A_Y$, considered as a subset of $\Pi_X$, has an $(n-1)$-dimensional supporting hyperplane at
$Z$ (because $Z$ is a boundary point). Call this hyperplane $\Pi'_Z$. Since $A_Z \subset A_Y$, $\Pi'_Z$ is a supporting
hyperplane for $A_Z$, regarded as a subset of $\Pi_X$. Any $n$-dimensional hyperplane in $\mathbb{R}^{n+1}$ that
contains the $(n-1)$-dimensional hyperplane $\Pi'_Z \subset \Pi_X$ clearly supports $A_Z$ at $Z$, where $A_Z$ is
now considered as a convex subset of $\mathbb{R}^{n+1}$. Since there are infinitely many such $n$-dimensional
hyperplanes in $\mathbb{R}^{n+1}$, we have a contradiction to the uniqueness axiom S2. ■

Thanks to this last theorem it makes sense to talk about the direction of the normal to the
tangent plane $\Pi_X$ (with respect to the canonical scalar product on $\mathbb{R}^{n+1}$) pointing to the interior
of $A_X$. The part of axiom S2 that requires the tangent plane to have finite slope with respect to
the work coordinates means that the normal is never orthogonal to the energy axis. It appears
natural to extend the continuity requirement of axiom S2 by requiring not only that the slope but
also the direction of the normal depends continuously on $X$. Since $\Gamma$ is connected it then follows
immediately that forward sectors are on the 'same side' of the tangent plane, i.e., the projection of
the normal on the energy axis is either positive for all sectors or negative for all sectors.

In fact, it is not necessary to invoke this strengthened continuity requirement to prove that
forward sectors all point the same way. It is already a consequence of axioms A1-A7, S1 and the
finite slope part of axiom S2. We shall prove this below as Theorem 3.3, but leave the reader the
option to accept it simply as a part of the continuity requirement for tangent planes if preferred.

As far as our axiomatic framework is concerned the direction of the energy coordinate and
hence of the forward sectors is purely conventional, except for the proviso that once it has been
set for one system it is set for all systems. (This follows from Theorem 4.2 in the next section.)
We shall adopt the convention that they are on the positive energy side. From a physical point of
view there is more at stake, however. In fact, our operational interpretation of adiabatic processes
in Sect. II involves either the raising or lowering of a weight in a gravitational field and these two
cases are physically distinct. Our convention, together with the usual convention for the sign of
energy for mechanical systems and energy conservation, means that we are concerned with a world
where adiabatic processes at fixed work coordinates can never result in the raising of a weight, only
in the lowering of a weight. The opposite possibility differs from the former in a mathematically
trivial way, namely by an overall sign of the energy, but given the physical interpretation of the
energy direction in terms of raising and lowering of weights, such a world would be different from
the one we are used to.

Note that (3.7) tells us that the fact that forward sectors point upward is equivalent to the
temperature being everywhere positive. To illustrate what is involved here, let us consider a system
of $N$ independent spins in a magnetic field, so that each spin has energy either $0$ or $\varepsilon$. In the
thermodynamic limit $N, U \to \infty$ with $X = U/(N\varepsilon)$ fixed, the entropy per spin is easily calculated
according to the rules of statistical mechanics to be $S/N = -X \ln X - (1-X)\ln(1-X)$. The
first half of the energy range, $0 < U/(N\varepsilon) < 1/2$, has positive temperature while the second half,
$1/2 < U/(N\varepsilon) < 1$, has negative temperature, according to (3.7). How can we reconcile this with
our formulation of simple systems? That is to say, we insist that the state space $\Gamma$ of our spin system
consists only of the region $0 < U/(N\varepsilon) < 1/2$, and we ask what feature of our axioms has ruled
out the complementary region. The answer is that if we included the second half then convexity
would require that we also include the maximum entropy point $X = 1/2$. But the forward sector
of $X$ contains only $X$ itself and this violates axiom S1.
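
The sign change of the temperature in this spin example can be checked numerically. The following is a minimal sketch, assuming units in which Boltzmann's constant is 1 and writing x for U/(N*eps); the function names are ours and purely illustrative. It evaluates the entropy per spin and its derivative, which by the relation 1/T = dS/dU equals eps/T.

```python
import math


def entropy_per_spin(x: float) -> float:
    """S/N = -x ln x - (1 - x) ln(1 - x), with x = U/(N*eps) in (0, 1)."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)


def inverse_temperature(x: float) -> float:
    """eps/T = d(S/N)/dx = ln((1 - x)/x), in units with Boltzmann constant 1."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    return math.log((1.0 - x) / x)


# Scan the energy range: the sign of eps/T flips at the maximum-entropy point x = 1/2.
for x in (0.1, 0.25, 0.5, 0.75, 0.9):
    beta = inverse_temperature(x)
    if beta > 0:
        sign = "positive"
    elif beta < 0:
        sign = "negative"
    else:
        sign = "undefined (1/T = 0)"
    print(f"x = {x:4.2f}   S/N = {entropy_per_spin(x):.4f}   eps/T = {beta:+.4f}   T is {sign}")
```

The derivative is positive on the first half of the energy range and negative on the second, and it vanishes at x = 1/2, where the entropy per spin reaches its maximum value ln 2.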

This example captures the essential feature that lies behind the following general fact.

LEMMA 3.2 (Range of energy in forward sectors). Let $X = (U^0, V^0) \in \Gamma$ and assume
that its forward sector $A_X$ is on the positive energy side of $\Pi_X$. Then

$$A_X \cap \{(U, V^0) : U \in \mathbb{R}\} = \{(U, V^0) : U \geq U^0\} \cap \Gamma. \tag{3.12}$$

(If $A_X$ is on the negative energy side, then (3.12) holds with '$\geq$' replaced by '$\leq$'.)

Proof: The left side of (3.12), denoted $J_X$, is convex and relatively closed in $\Gamma$ by Theorem
3.1. It is not larger than the right side because $A_X$ lies above the tangent plane that cuts the line
$L = \{(U, V^0) : U \in \mathbb{R}\}$ at $X$. If it is strictly smaller than the right side of (3.12), then $J_X$ is a
compact interval. Let $X_1$ denote its midpoint. Then $J_{X_1}$, the intersection of $A_{X_1}$ with the line
$L$, is a closed subinterval of $J_X$ and its length is at most half the length of $J_X$. (Here we have
used transitivity, closedness, and that $X_1$ is on the boundary of $J_{X_1}$.) Repeating this procedure we
obtain a convergent sequence, $X_n$, $n = 1, 2, \dots$, of points in $J_X$, such that the forward sector of its
limit point $X_\infty$ contains only $X_\infty$ itself in violation of S1. ■

The 'same sidedness' of forward sectors follows from Lemmas 3.1 and 3.2 together with the
finite slope of tangent planes.

THEOREM 3.3 (Forward sectors point the same way). If $\Gamma$ is the state space of a
simple system, and if the forward sector $A_X$ for one $X \in \Gamma$ is on the positive energy side of the
tangent plane $\Pi_X$, then the same holds for all states in $\Gamma$.

Proof: For brevity, let us say that a state $X \in \Gamma$ is 'positive' if $A_X$ is on the positive energy
side of $\Pi_X$, and that $X$ is 'negative' otherwise. Let $I$ be the intersection of $\Gamma$ with a line parallel
to the $U$-axis, i.e., $I = \{(U, V) \in \Gamma,\ U \in \mathbb{R}\}$ for some $V \in \mathbb{R}^n$. If $I$ contains a positive point, $Y$,
then it follows immediately from Lemma 3.2 that all points, $Z$, that lie above it on $I$ (i.e., have
higher energy) are also positive. In fact, one can pass from $Y$ to $Z$, and if $Z$ were negative, then,
using Lemma 3.2 again, one could pass from $Z$ to a state $X$ below $Y$, violating the positivity of
$Y$. Lemma 3.1, on the other hand, immediately implies that all points $X$ below $Y$ are positive, for
$Y \prec Z$ for some $Z$ strictly above $Y$, by S1. By the analogous argument for negative $Y$ we conclude
that all points on $I$ have the same 'sign'.

Since $\Gamma$ is convex, and therefore connected, the coexistence of positive and negative points
would mean that there are pairs of points of different sign, arbitrarily close together. Now if $X$
and $Y$ are sufficiently close, then the line $l_Y$ through $Y$ parallel to the $U$ axis intersects both $A_X$
and its complement. (This follows easily from the finite slope of the tangent plane, cf. the proof of
Theorem 3.5 (ii) below.) Transitivity and Lemma 3.2 imply that any point in $\partial A_X \cap l_Y$ has the
same sign as $X$, and since all points on $l_Y$ have the same sign, this applies also to $Y$. ■

From now on we adopt the convention that the forward sectors in $\Gamma$ are on the positive energy
side of all the tangent planes. The mathematical and physical aspects of this choice were already
discussed above.

Since negative states are thus excluded (the possibility to do so is the content of Theorem
3.3), we may restate Lemma 3.2 in the following way, which we call Planck's principle because
Planck emphasized the importance for thermodynamics of the fact that 'rubbing' (i.e., increasing
the energy at fixed work coordinate) is an irreversible process (Planck, 1926, 1954).

THEOREM 3.4 (Planck's principle). If two states, $X$ and $Y$, of a simple system have
the same work coordinates, then $X \prec Y$ if and only if the energy of $Y$ is no less than the energy of
$X$.

Taking our operational definition of the relation $\prec$ in Sect. II into account, the 'only if' part
of this theorem is essentially a paraphrasing of the Kelvin-Planck statement in Section I.A., but
avoiding the concept of 'cooling':

'No process is possible, the sole result of which is a change in the energy of a simple system
(without changing the work coordinates) and the raising of a weight.'

This statement is clearly stronger than Caratheodory's principle, for it explicitly identifies
states that are arbitrarily close to a given state, but not adiabatically accessible from it.

It is worth remarking that Planck's principle, and hence this version of the Kelvin-Planck
statement, already follows from axioms A1-A7, S1 and a part of S2, namely the requirement that
the tangent planes to the forward sectors have finite slope with respect to the work coordinates.
Neither Lipschitz continuity of the slope, nor the connectedness axiom S3, is needed for this.
However, although Planck's principle puts severe restrictions on the geometry of forward sectors,
it alone does not suffice to establish the comparison principle. For instance, the forward sector $A_Y$
of a point $Y$ on the boundary $\partial A_X$ of another forward sector could be properly contained in $A_X$.
In such a situation the relation $\prec$ could not be characterized by an entropy function. In order to
exclude pathological cases like this we shall now study the boundary $\partial A_X$ of a forward sector in
more detail, making full use of the axioms S2 and S3.

We denote by $\rho_X$ the projection of $\partial A_X$ on $\mathbb{R}^n$, i.e.,

$$\rho_X = \{V \in \mathbb{R}^n : (U, V) \in \partial A_X \text{ for some } U \in \mathbb{R}\}. \tag{3.13}$$

Clearly, $\rho_X$ is a connected subset of $\mathbb{R}^n$ because of assumption S3. Note that $\rho_X$ might be
strictly smaller than the projection of $A_X$. See Figure 4.

Insert Figure 4 here 



THEOREM 3.5 (Definition and properties of the function $u_X$). Fix $X = (U^0, V^0)$ in
$\Gamma$.

(i). Let $Y \in \partial A_X$. Then $A_X$ has a tangent plane at $Y$ and it is $\Pi_Y$.

(ii). $\rho_X$ is an open, connected subset of $\mathbb{R}^n$.

(iii). For each $V \in \rho_X$ there is exactly one number, $u_X(V)$, such that $(u_X(V), V) \in \partial A_X$.
I.e.,

$$\partial A_X = \{(u_X(V), V) : V \in \rho_X\}. \tag{3.14}$$

This $u_X(V)$ is given by

$$u_X(V) = \inf\{u : (u, V) \in A_X\}. \tag{3.15}$$

The function $u_X$ is continuous on $\rho_X$ and locally convex, i.e., $u_X$ is convex on any convex subset
of $\rho_X$. (Note that $\rho_X$ need not be convex, or even contractible to a point.) Moreover,

$$A_X \supset \{(U, V) : U \geq u_X(V),\ V \in \rho_X\} \cap \Gamma. \tag{3.16}$$

(iv). The function $u_X$ is a differentiable function on $\rho_X$ with a locally Lipschitz continuous
derivative and satisfies the system of partial differential equations

$$\frac{\partial u_X}{\partial V_j}(V) = -P_j(u_X(V), V) \quad \text{for } j = 1, \dots, n. \tag{3.17}$$

(v). The function $u_X$ is the only continuous function defined on $\rho_X$ that satisfies the differential
equation, (3.17), in the sense of distributions, and that satisfies $u_X(V^0) = U^0$.

Remark: A solution to (3.17) is not guaranteed a priori; an integrability condition on $P$ is
needed. However, our assumption S2 implies that $P$ describes the boundary of $A_X$ (cf. (i) above),
so the integrability condition is automatically fulfilled. Thus, a solution exists. It is the Lipschitz
continuity that yields uniqueness; indeed, it was precisely our desire to have a unique solution to
(3.17) that motivated axiom S2.
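
To make the role of the differential equation for the boundary function concrete, here is a minimal numerical sketch for a single work coordinate. It assumes, purely for illustration, the monatomic ideal gas pressure P(U, V) = (2/3) U / V, which is not taken from the text; the function names are ours. For this pressure the equation du/dV = -P(u, V) has the closed-form solution u(V) = U0 (V0/V)^(2/3), which the classical fourth-order Runge-Kutta integration should reproduce.

```python
def pressure(u: float, v: float) -> float:
    """Assumed model for illustration: monatomic ideal gas, P = (2/3) U / V."""
    return (2.0 / 3.0) * u / v


def adiabat_energy(u0: float, v0: float, v1: float, steps: int = 1000) -> float:
    """Integrate du/dV = -P(u, V) from (v0, u0) to V = v1 with classical RK4."""
    h = (v1 - v0) / steps  # step in the work coordinate; negative for compression
    u, v = u0, v0
    for _ in range(steps):
        k1 = -pressure(u, v)
        k2 = -pressure(u + 0.5 * h * k1, v + 0.5 * h)
        k3 = -pressure(u + 0.5 * h * k2, v + 0.5 * h)
        k4 = -pressure(u + h * k3, v + h)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        v += h
    return u


# Adiabatic expansion from V = 1 to V = 2 starting at U = 1.
u0, v0, v1 = 1.0, 1.0, 2.0
numeric = adiabat_energy(u0, v0, v1)
exact = u0 * (v0 / v1) ** (2.0 / 3.0)
print(f"numeric u(V1) = {numeric:.10f}, closed form = {exact:.10f}")
```

Because this pressure is locally Lipschitz on U > 0, V > 0, the solution through a given initial point is unique, so integrating back from the endpoint returns to the starting energy. That is the one-dimensional picture behind the statement that every point of an adiabat determines the same adiabat.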

Proof: (i). Since $Y \in \partial A_X$, $A_X$ has some support plane, $\Pi$, at $Y$. Since $A_X$ is closed by
Theorem 3.1 we have $Y \in A_X$ and hence $A_Y \subset A_X$ by transitivity, A2. Thus $\Pi$ also supports $A_Y$
at $Y$. By assumption S2, $A_Y$ has a unique support plane at $Y$, namely $\Pi_Y$. Therefore, $\Pi = \Pi_Y$.

(ii). Connectedness of $\rho_X$ follows immediately from assumption S3, i.e., $\partial A_X$ is connected. The
following proof that $\rho_X$ is open does not use assumption S3. The key fact is that by (i) and S2 the
tangent plane to the convex set $A_X$ has finite slope at any $Y \in \partial A_X$. Pick a $Y = (U, V) \in \partial A_X$.
Since $\Gamma$ is open, the closed cylinder $C = \{(U', V') : |V' - V| \leq \varepsilon,\ |U' - U| \leq \sqrt{\varepsilon}\}$ with $Y$ at
its center lies in $\Gamma$ for $\varepsilon > 0$ small enough. Since the tangent plane through $Y$ has finite slope,
the bottom 'disc' $D_- = \{(U - \sqrt{\varepsilon}, V') : |V' - V| \leq \varepsilon\}$ lies below the tangent plane for $\varepsilon$ small
enough and thus belongs to the complement of $A_X$. Consider the intersection of $A_X$ with the top
disc, $D_+ = \{(U + \sqrt{\varepsilon}, V') : |V' - V| \leq \varepsilon\}$. This intersection is compact, convex and contains the
point $(U + \sqrt{\varepsilon}, V)$ by Lemma 3.2 and A2 (the latter implies that $A_Y \subset A_X$). Its boundary is also
compact and thus contains a point with minimal distance $\delta$ from the cylinder axis (i.e., from the
point $(U + \sqrt{\varepsilon}, V)$). We are obviously done if we show that $\delta > 0$, for then all lines parallel to the
cylinder axis with distance $< \delta$ from the axis intersect both $A_X$ and its complement, and hence
the boundary $\partial A_X$. Now, if $\delta = 0$, it follows from Lemma 3.2 and transitivity that the vertical line
joining $(U + \sqrt{\varepsilon}, V)$ and $(U, V)$ has an empty intersection with the interior of $A_X$. But then $A_X$
has a vertical support plane (because it is a convex set), contradicting S2.

(iii). The proof of (3.14)-(3.16) is already contained in Lemma 3.2, bearing in mind that
$A_Y \subset A_X$ for all $Y \in \partial A_X$. The local convexity of $u_X$ follows from its definition: Let $C \subset \rho_X$ be
convex, let $V^1$ and $V^2$ be in $C$ and let $0 \leq \lambda \leq 1$. Then the point $V := \lambda V^1 + (1 - \lambda)V^2$ is in
$C$ (by definition) and, by axiom A7, $(\lambda u_X(V^1) + (1 - \lambda)u_X(V^2), V)$ is in $A_X$. Hence, by (3.15),
$u_X(V) \leq \lambda u_X(V^1) + (1 - \lambda)u_X(V^2)$. Finally, every convex function defined on an open, convex
subset of $\mathbb{R}^n$ is continuous.

(iv). Fix $V \in \rho_X$, let $B \subset \rho_X$ be an open ball centered at $V$ and let $Y := (u_X(V), V) \in \partial A_X$.
By (i) above and (3.4) we have

$$u_X(V') - u_X(V) + \sum_i P_i(Y)\,(V'_i - V_i) \geq 0 \tag{3.18}$$

for all $V' \in B$. Likewise, applying (i) above and (3.4) to the point $Y' := (u_X(V'), V')$ we have

$$u_X(V) - u_X(V') + \sum_i P_i(Y')\,(V_i - V'_i) \geq 0. \tag{3.19}$$

As $V' \to V$, $P(Y') \to P(Y)$, since $u_X$ is continuous and $P$ is continuous. Thus, if $1 \leq j \leq n$ is fixed
and if $V'_i := V_i$ for $i \neq j$, $V'_j = V_j + \varepsilon$ then, taking limits $\varepsilon \to 0$ in the two inequalities above, we
have that

$$\frac{u_X(V') - u_X(V)}{\varepsilon} \to -P_j(Y), \tag{3.20}$$

which is precisely (3.17).

By assumption $P(Y)$ is continuous, so $u_X$ is continuously differentiable, and hence locally
Lipschitz continuous. But then $P(u_X(V), V)$ is locally Lipschitz continuous in $V$.

(v). The uniqueness is a standard application of Banach's contraction mapping principle, given
the important hypothesis that $P$ is locally Lipschitz continuous and the connectedness of the open
set $\rho_X$. ■

According to the last theorem the boundary of a forward sector is described by the unique
solution of a system of differential equations. As a corollary it follows that all points on the
boundary are adiabatically equivalent and thus have the same forward sectors:

THEOREM 3.6 (Reversibility on the boundary). If $Y \in \partial A_X$, then $X \in \partial A_Y$ and
hence $A_Y = A_X$.

Proof: Assume $Y = (U^1, V^1) \in \partial A_X$. The boundary $\partial A_Y$ is described by the function $u_Y$
which solves Eqs. (3.17) with the condition $u_Y(V^1) = U^1$. But $u_X$, which describes the boundary
$\partial A_X$, solves the same equation with the same initial condition. This solution is unique on $\rho_Y$ by
Theorem 3.5(v), so we conclude that $\partial A_Y \subset \partial A_X$ and hence $\rho_Y \subset \rho_X$. The theorem will be proved
if we show that $\rho_X = \rho_Y$. Suppose, on the contrary, that $\rho_Y$ is strictly smaller than $\rho_X$. Then,
since $\rho_X$ is open, there is some point $V \in \rho_X$ that is in the boundary of $\rho_Y$, and hence $V \notin \rho_Y$
since $\rho_Y$ is open. We claim that $\partial A_Y$ is not relatively closed in $\Gamma$, which is a contradiction since $A_Y$
must be relatively closed. To see this, let $V^j$, for $j = 1, 2, 3, \dots$, be in $\rho_Y$ and $V^j \to V$ as $j \to \infty$.
Then $u_X(V^j) \to u_X(V)$ since $u_X$ is continuous. But $u_Y(V^j) = u_X(V^j)$, so the sequence of points
$(u_Y(V^j), V^j)$ in $A_X$ converges to $Z := (u_X(V), V) \in \Gamma$. Thus, $Z$ is in the relative closure of $\partial A_Y$
but $Z \notin \partial A_Y$ because $V \notin \rho_Y$, thereby establishing a contradiction. ■

We are now in a position to prove the main result in this section. It shows that $\Gamma$ is foliated by
the adiabatic surfaces $\partial A_X$, and that the points of $\Gamma$ are all comparable. More precisely, $X \prec\prec Y$
if and only if $A_Y$ is contained in the interior of $A_X$, and $X \overset{A}{\sim} Y$ if and only if $Y \in \partial A_X$.

THEOREM 3.7 (Forward sectors are nested). With the above assumptions, i.e., A1-A7
and S1-S3, we have the following. If $A_X$ and $A_Y$ are two forward sectors in the state space, $\Gamma$, of
a simple system then exactly one of the following holds.

(a). $A_X = A_Y$, i.e., $X \overset{A}{\sim} Y$.
(b). $A_X \subset$ Interior($A_Y$), i.e., $Y \prec\prec X$.
(c). $A_Y \subset$ Interior($A_X$), i.e., $X \prec\prec Y$.

In particular, $\partial A_X$ and $\partial A_Y$ are either identical or disjoint.

Proof: There are three (non-exclusive) cases:

Case 1. $Y \in A_X$
Case 2. $X \in A_Y$
Case 3. $X \notin A_Y$ and $Y \notin A_X$.

By transitivity, case 1 is equivalent to $A_Y \subset A_X$. Then, either $Y \in \partial A_X$ (in which case
$A_Y = A_X$ by Theorem 3.6) or $Y \in$ Interior($A_X$). In the latter situation we conclude that $\partial A_Y \subset$
Interior($A_X$), for otherwise $\partial A_Y \cap \partial A_X$ contains a point $Z$ and Theorem 3.6 would tell us that
$\partial A_Y = \partial A_Z = \partial A_X$, which would mean that $A_Y = A_X$. Thus, case 1 agrees with the conclusion
of our theorem.

Case 2 is identical to case 1, except for interchanging $X$ and $Y$.

Therefore, we are left with the case that $Y \notin A_X$ and $X \notin A_Y$. This, we claim, is impossible
for the following reason.

Let $Z$ be some point in the interior of $A_X$ and consider the line segment $L$ joining $Y$ to $Z$
(which lies in $\Gamma$ since $\Gamma$ is convex). If we assume $Y \notin A_X$ then part of $L$ lies outside $A_X$, and
therefore $L$ intersects $\partial A_X$ at some point $W \in \partial A_X$. By Theorem 3.6, $A_X$ and $A_W$ are the same
set, so $W \prec Z$ (because $X \prec Z$). By Lemma 3.1, $Y \prec Z$ also. Since $Z$ was arbitrary, we learn that
Interior($A_X$) $\subset A_Y$. By the same reasoning Interior($A_Y$) $\subset A_X$. Since $A_X$ and $A_Y$ are both closed,
the assumption that $Y \notin A_X$ and $X \notin A_Y$ has led us to the conclusion that they are identical. ■

Figure 5 illustrates the content of Theorem 3.7. The end result is that the forward sectors are 
nicely nested and thereby establishes the comparison hypothesis for simple systems, among other 
things. 

Insert Figure 5 here 

The adiabats $\partial A_X$ foliate $\Gamma$ and using Theorem 3.5 it may be shown that there is always a
continuous function $\sigma$ that has exactly these adiabats as level sets. (Such a function is usually
referred to as an 'empirical entropy'.) But although the sets $A_X$ are convex, the results established
so far do not suffice to show that there is a concave function with the adiabats as level sets. For this
and further properties of entropy we shall rely on the axioms about thermal equilibrium discussed
in the next section.

As a last topic in this section we would like to come back to the claim made in Section II.A.2
that our operational definition of the relation $\prec$ coincides with definitions in textbooks based on
the concept of 'adiabatic process', i.e., a process taking place in an 'adiabatic enclosure'. We
already discussed the connection from a general point of view in Section II.C, and showed that
both definitions coincide. However, there is also another point of view that relates the two, and
which we now present. It is based on the idea that, quite generally, if one relation is included
in another then the two relations must coincide for simple systems. This very general result is
Theorem 3.8 below.

Whatever 'adiabatic process' means, we consider it a minimal requirement that the relation
based on it is a subrelation of our $\prec$ according to the operational definition in Sect. II.A. More
precisely, denoting this hypothetical relation based on 'adiabatic process' by $\prec^*$, it should be true
that $X \prec^* Y$ implies $X \prec Y$. Moreover, our motivations for the axioms A1-A6 and S1-S3 for $\prec$
apply equally well to $\prec^*$, so we may assume that $\prec^*$ also satisfies these axioms. In particular, the
forward sector $A^*_X$ of $X$ with respect to $\prec^*$ is convex and closed with a nonempty interior and
with $X$ on its boundary. The following simple result shows that $\prec$ and $\prec^*$ must then necessarily
coincide.

THEOREM 3.8 (There are no proper inclusions). Suppose that $\prec^{(1)}$ and $\prec^{(2)}$ are two
relations on multiple scaled products of a simple system $\Gamma$ satisfying axioms A1-A7 as well as S1-S3.
If

$$X \prec^{(1)} Y \quad \text{implies} \quad X \prec^{(2)} Y$$

for all $X, Y \in \Gamma$, then $\prec^{(1)} = \prec^{(2)}$.

Proof: We use superscripts (1) and (2) to denote the two cases. Clearly, the hypothesis is
equivalent to $A^{(1)}_X \subset A^{(2)}_X$ for all $X \in \Gamma$. We have to prove $A^{(2)}_X \subset A^{(1)}_X$. Suppose not. Then there
is a $Y$ such that $X \prec^{(2)} Y$ but $X \not\prec^{(1)} Y$. By Theorem 3.7 for $\prec^{(1)}$ we have that $Y \prec^{(1)} X$. By
our hypothesis, $Y \prec^{(2)} X$, and thus we have $X \overset{A}{\sim}{}^{(2)}\, Y$.

Now we use what we know about the forward sectors of simple systems. $A^{(2)}_X$ has a non-empty
interior, so the complement of $A^{(1)}_X$ in $A^{(2)}_X$ contains a point $Y$ that is not on the boundary of
$A^{(2)}_X$. On the other hand, we just proved that $X \overset{A}{\sim}{}^{(2)}\, Y$, which implies that $Y \in \partial A^{(2)}_X$. This is a
contradiction. ■

IV. THERMAL EQUILIBRIUM



In this section we introduce our axioms about thermal contact of simple systems. We then use 
these assumptions to derive the comparison hypothesis for products of such systems. This will be 
done in two steps. First we consider scaled copies of a single simple system and then products of 
different systems. The key idea is that two simple systems in thermal equilibrium can be regarded 
as a new simple system, to which Theorem 3.7 applies. We emphasize that the word 'thermal' has 
nothing to do with temperature — at this point in the discussion. Temperature will be introduced 
in the next section, and its existence will rely on the properties of thermal contact, but thermal 
equilibrium, which is governed by the zeroth law, is only a statement about mutual equilibrium of 
systems and not a statement about temperature. 

A. Assumptions about thermal contact 

We assume that a relation -< satisfying axioms A1-A6 is given, but A7 and CH are not assumed 
here. We shall make five assumptions about thermal equilibrium, T1-T5. Our first axiom says that 
one can form new simple systems by bringing two simple systems into thermal equilibrium and that 
this operation is adiabatic (for the compound system, not for each system individually). 

T1) Thermal contact. Given any two simple systems with state spaces $\Gamma_1$ and $\Gamma_2$, there is another
simple system, called the thermal join of $\Gamma_1$ and $\Gamma_2$, whose state space is denoted by
$\Delta_{12}$. The work coordinates in $\Delta_{12}$ are $(V_1, V_2)$ with $V_1$ the work coordinates of $\Gamma_1$ and $V_2$ the
work coordinates of $\Gamma_2$. The range of the (single) energy coordinate of $\Delta_{12}$ is the sum of all
possible energies in $\Gamma_1$ and $\Gamma_2$ for the given values of the work coordinates. In symbols:

$$\Delta_{12} = \{(U, V_1, V_2) : U = U_1 + U_2 \text{ with } (U_1, V_1) \in \Gamma_1,\ (U_2, V_2) \in \Gamma_2\}. \tag{4.1}$$

By assumption, there is always an adiabatic process, called thermal equilibration, that takes
a state in the compound system, $\Gamma_1 \times \Gamma_2$, into a state in $\Delta_{12}$ which is given by the following
formula:

$$\Gamma_1 \times \Gamma_2 \ni ((U_1, V_1), (U_2, V_2)) \prec (U_1 + U_2, V_1, V_2) \in \Delta_{12}.$$

From the physical point of view, a state in $\Delta_{12}$ is a "black box" containing the two systems,
with energies $U_1$ and $U_2$, respectively, such that $U_1 + U_2 = U$. The values of $U_1$ and $U_2$ need not
be unique, and we regard all such pairs (if there is more than one) as being equivalent since, by T2
below, they are adiabatically equivalent. This state in $\Delta_{12}$ can be pictured, physically, as having
the two systems side by side (each with its own pistons, etc.) and linked by a copper thread that
allows 'heat' to flow from one to the other until thermal equilibrium is attained. The total energy
$U = U_1 + U_2$ can be selected at will (within the range permitted by $V_1$ and $V_2$), but the individual
energies $U_1$ and $U_2$ will be determined by the properties of the two systems. Note that $\Delta_{12}$ is
convex, a fact that follows easily from the convexity of $\Gamma_1$ and $\Gamma_2$.

The next axiom simply declares the 'obvious' fact that we can disconnect the copper thread, 
once equilibrium has been reached, and restore the original two systems. 

T2) Thermal splitting. For any point $(U, V_1, V_2) \in \Delta_{12}$ there is at least one pair of states,
$(U_1, V_1) \in \Gamma_1$, $(U_2, V_2) \in \Gamma_2$, with $U = U_1 + U_2$, such that

$$\Delta_{12} \ni (U, V_1, V_2) \stackrel{A}{\sim} ((U_1, V_1), (U_2, V_2)) \in \Gamma_1 \times \Gamma_2.$$

In particular, the following is assumed to hold: If $(U, V)$ is a state of a simple system $\Gamma$ and
$\lambda \in [0, 1]$ then

$$(U, (1 - \lambda)V, \lambda V) \stackrel{A}{\sim} (((1 - \lambda)U, (1 - \lambda)V), (\lambda U, \lambda V)) \in \Gamma^{(1-\lambda)} \times \Gamma^{(\lambda)}.$$

We are now in a position to introduce another kind of equivalence relation among states, in
addition to $\stackrel{A}{\sim}$.

Definition. If $((U_1, V_1), (U_2, V_2)) \stackrel{A}{\sim} (U_1 + U_2, V_1, V_2)$ we say that the states $X = (U_1, V_1)$ and
$Y = (U_2, V_2)$ are in thermal equilibrium and write

$$X \stackrel{T}{\sim} Y.$$

It is clear that $X \stackrel{T}{\sim} Y$ implies $Y \stackrel{T}{\sim} X$. Moreover, by axiom T2 and axioms A4 and A5 we
always have $X \stackrel{T}{\sim} X$.

The next axiom implies that $\stackrel{T}{\sim}$ is, indeed, an equivalence relation. It is difficult to overstate its
importance since it is the key to eventually establishing the fact that entropy is additive not only
with respect to scaled copies of one system but also with respect to different kinds of systems.

T3) Zeroth law of thermodynamics. If $X \stackrel{T}{\sim} Y$ and if $Y \stackrel{T}{\sim} Z$ then $X \stackrel{T}{\sim} Z$.

The equivalence classes w.r.t. the relation $\stackrel{T}{\sim}$ are called isotherms.

The question whether the zeroth law is really needed as an independent postulate or can be 
derived from other assumptions is the subject of some controversy, see e.g., (Buchdahl, 1986), 
(Walter, 1989), (Buchdahl, 1989). Buchdahl (1986) derives it from his analysis of the second law 
for three systems in thermal equilibrium. However, it is not clear whether the zeroth law comes for 
free; if we really pursued this idea in our framework we should probably find it necessary to invoke 
some sort of assumption about the three-system equilibria. 

Before proceeding further let us point out a simple consequence of T1-T3.

THEOREM 4.1 (Scaling invariance of thermal equilibrium). If $X$ and $Y$ are two
states of two simple systems (possibly the same or possibly different systems) and if $\lambda, \mu > 0$ then
the relation $X \stackrel{T}{\sim} Y$ implies $\lambda X \stackrel{T}{\sim} \mu Y$.

Proof: $(X, \lambda X) = ((U_X, V_X), (\lambda U_X, \lambda V_X)) \stackrel{A}{\sim} ((1 + \lambda)U_X, V_X, \lambda V_X)$ by axiom T2. But this
means, by the above definition of thermal equilibrium, that $X \stackrel{T}{\sim} \lambda X$. In the same way, $Y \stackrel{T}{\sim} \mu Y$.
By the zeroth law, axiom T3, this implies $\lambda X \stackrel{T}{\sim} \mu Y$. ■

Another simple consequence of the axioms for thermal contact concerns the orientation of
forward sectors with respect to the energy. In Theorem 3.3 in the previous section we had already
shown that in a simple system the forward sectors are either all on the positive energy side or
all on the negative energy side of the tangent planes to the sectors, but the possibility that the
direction is different for different systems was still open. The coexistence of systems belonging to
both cases, however, would violate our axioms T1 and T2. The different orientations of the sectors
with respect to the energy correspond to different signs for the temperature as defined in Section
V. Our axioms are only compatible with systems of one sign.

THEOREM 4.2 (Direction of forward sectors). The forward sectors of all simple systems
point the same way, i.e., they are either all on the positive energy side of their tangent planes or
all on the negative energy side.

Proof: This follows directly from T1 and T2, because a system with sectors on the positive
energy side of the tangent planes can never come to thermal equilibrium with a system whose
sectors are on the negative side of the tangent planes. To be precise, suppose that $\Gamma_1$ has positive
sectors, $\Gamma_2$ has negative sectors and that there are states $X = (U_1, V_1) \in \Gamma_1$ and $Y = (U_2, V_2) \in \Gamma_2$
such that $X \stackrel{T}{\sim} Y$. (Such states exist by T2.) Then, for any sufficiently small $\delta > 0$,

$$(U_1, V_1) \prec (U_1 + \delta, V_1) \quad\text{and}\quad (U_2, V_2) \prec (U_2 - \delta, V_2)$$

by Theorem 3.4 (Planck's principle). With $U := U_1 + U_2$ we then have the two relations

$$(U, V_1, V_2) \stackrel{A}{\sim} ((U_1, V_1), (U_2, V_2)) \prec ((U_1 + \delta, V_1), (U_2, V_2)) \prec (U + \delta, V_1, V_2)$$

$$(U, V_1, V_2) \stackrel{A}{\sim} ((U_1, V_1), (U_2, V_2)) \prec ((U_1, V_1), (U_2 - \delta, V_2)) \prec (U - \delta, V_1, V_2).$$

This means that starting from $(U, V_1, V_2) \in \Delta_{12}$ we can move adiabatically both upwards and
downwards in energy (at fixed work coordinates), but this is impossible (by Theorem 3.3) because
$\Delta_{12}$ is a simple system, by Axiom T1. ■

For the next theorem we recall that an entropy function on $\Gamma$ is a function that exactly
characterizes the relation $\prec$ on multiple scaled copies of $\Gamma$, in the sense of Theorem 2.2. As defined
in Section II, entropy functions $S_1$ on $\Gamma_1$ and $S_2$ on $\Gamma_2$ are said to be consistent if together they
characterize the relation $\prec$ on multiple scaled products of $\Gamma_1$ and $\Gamma_2$ in the sense of Theorem
2.5. The comparison hypothesis guarantees the existence of such consistent entropy functions, by
Theorem 2.5, but our present goal is to derive the comparison hypothesis for compound systems
by using the notion of thermal equilibrium. In doing so, and also in Section V, we shall make use
of the following consequence of consistent entropy functions.

THEOREM 4.3 (Thermal equilibrium is characterized by maximum entropy). If
$S$ is an entropy function on the state space of a simple system, then $S$ is a concave function of $U$
for fixed $V$. If $S_1$ and $S_2$ are consistent entropy functions on the state spaces $\Gamma_1$ and $\Gamma_2$ of two
simple systems and $(U_i, V_i) \in \Gamma_i$, $i = 1, 2$, then $(U_1, V_1) \stackrel{T}{\sim} (U_2, V_2)$ holds if and only if the sum of
the entropies takes its maximum value at $((U_1, V_1), (U_2, V_2))$ for fixed total energy and fixed work
coordinates, i.e.,

$$\max_W \left[ S_1(W, V_1) + S_2((U_1 + U_2) - W, V_2) \right] = S_1(U_1, V_1) + S_2(U_2, V_2). \tag{4.2}$$

Proof: The concavity of $S$ is true for any simple system by Theorem 2.8, which uses the convex
combination axiom A7. It is interesting to note, however, that concavity in $U$ for fixed $V$ follows
from axioms T1, T2 and A5 alone, even if A7 is not assumed. In fact, by axiom T1 we have, for
states $(U, V)$ and $(U', V)$ of a simple system with the same work coordinates,

$$(((1 - \lambda)U, (1 - \lambda)V), (\lambda U', \lambda V)) \prec ((1 - \lambda)U + \lambda U', (1 - \lambda)V, \lambda V).$$

By T2, and with $U'' := (1 - \lambda)U + \lambda U'$, this latter state is equivalent to

$$(((1 - \lambda)U'', (1 - \lambda)V), (\lambda U'', \lambda V)),$$

which, by A5, is $\stackrel{A}{\sim}$ equivalent to $(U'', V)$. Since $S$ is additive and non-decreasing under $\prec$ this
implies

$$(1 - \lambda)S(U, V) + \lambda S(U', V) \le S((1 - \lambda)U + \lambda U', V).$$

For the second part of our theorem, let $(U_1, V_1)$ and $(U_2, V_2)$ be states of two simple systems.
Then T1 says that for any $W$ such that $(W, V_1) \in \Gamma_1$ and $((U_1 + U_2) - W, V_2) \in \Gamma_2$ one has

$$((W, V_1), ((U_1 + U_2) - W, V_2)) \prec (U_1 + U_2, V_1, V_2).$$

The definition of thermal equilibrium says that $(U_1 + U_2, V_1, V_2) \stackrel{A}{\sim} ((U_1, V_1), (U_2, V_2))$ if and only if
$(U_1, V_1) \stackrel{T}{\sim} (U_2, V_2)$. Since the sum of consistent entropies characterizes the order relation on the
product space the assertion of the theorem follows. ■
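
The maximum principle (4.2) can be tried out numerically. The following is a minimal sketch, not part of the original argument: it assumes two bodies without work coordinates whose entropies are $S_i(U) = C_i \ln U$ (the constants `C1 = 1`, `C2 = 3` and all function names are hypothetical), and it locates the equilibrium split of a fixed total energy by maximizing the entropy sum. For this assumed model the maximum should sit at $W = C_1 U/(C_1 + C_2)$.

```python
import math

def make_entropy(heat_capacity):
    """Assumed model: S(U) = C * ln(U), concave and increasing in U."""
    return lambda U: heat_capacity * math.log(U)

def equilibrium_energy(S1, S2, U_total, iterations=200):
    """Return the W in (0, U_total) that maximizes S1(W) + S2(U_total - W).

    Ternary search is valid here because the objective is concave in W.
    """
    lo, hi = 0.0, U_total
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if S1(m1) + S2(U_total - m1) < S1(m2) + S2(U_total - m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def in_thermal_equilibrium(S1, S2, U1, U2, tol=1e-9):
    """Check (4.2): the given split attains the maximal total entropy."""
    W = equilibrium_energy(S1, S2, U1 + U2)
    return S1(U1) + S2(U2) >= S1(W) + S2(U1 + U2 - W) - tol

S1, S2 = make_entropy(1.0), make_entropy(3.0)   # hypothetical heat capacities
W_star = equilibrium_energy(S1, S2, 8.0)        # close to 1/(1+3) * 8 = 2 in this model
```

In this toy model the split (2, 6) of the total energy 8 passes `in_thermal_equilibrium`, while the split (4, 4) does not, because its entropy sum lies strictly below the maximum.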

We come now to what we call the transversality axiom, which is crucial for establishing the 
comparison hypothesis, CH, for products of simple systems. 

T4) Transversality. If $\Gamma$ is the state space of a simple system and if $X \in \Gamma$, then there exist
states $X_0 \stackrel{T}{\sim} X_1$ with $X_0 \prec\prec X \prec\prec X_1$.

To put this in words, the axiom requires that for every adiabat there exists at least one isotherm
(i.e., an equivalence class w.r.t. $\stackrel{T}{\sim}$), containing points on both sides of the adiabat. Note that, for
each given $X$, only two points in the entire state space $\Gamma$ are required to have the stated property.
See Figure 6.

Insert Figure 6 here 

We remark that the condition $X_0 \prec\prec X \prec\prec X_1$ obviously implies axiom S1. However, as far as
the needs of this Section IV are concerned, the weaker condition $X_0 \prec X \prec X_1$ together with
$X_0 \prec\prec X_1$ would suffice, and this would not imply S1. The strong version of transversality, stated
above, will be needed in Section V, however.

At the end of this section we shall illustrate, by the example of 'thermometers', the significance 
of axiom T4 for the existence of an entropy function. There we shall also show how an entropy 
function can be defined for a system that violates T4, provided its thermal combination with some 
other system (that itself satisfies T4) does satisfy T4. 

The final thermal axiom states, essentially, that the range of temperatures that a simple system 
can have is the same for all simple systems under consideration and is independent of the work 
coordinates. In this section axiom T5 will be needed only for Theorem 4.9. It will also be used 
again in the next section when we establish the existence and properties of temperature. (We 
repeat that the word 'temperature' is used in this section solely as a mnemonic.) 

T5) Universal temperature range. If $\Gamma_1$ and $\Gamma_2$ are state spaces of simple systems then, for
every $X \in \Gamma_1$ and every $V \in \rho(\Gamma_2)$, where $\rho$ denotes the projection on the work coordinates,
$\rho(U', V') := V'$, there is a $Y \in \Gamma_2$ with $\rho(Y) = V$, such that $X \stackrel{T}{\sim} Y$.

The physical motivation for T5 is the following. A sufficiently large copy of the first system in
the state $X \in \Gamma_1$ can act as a heat bath for the second, i.e., when the second system is brought into
thermal contact with the first at fixed work coordinates, $V$, it is always possible to reach thermal
equilibrium, but the change of $X$ will be very small since the copy of the first system is so large.

This axiom is inserted mainly for convenience and one might weaken it and require it to hold
only within a group of systems that can be placed in thermal contact with each other. However,
within such a group this axiom is really necessary if one wants to have a consistent theory.

B. The comparison principle in compound systems



1. Scaled copies of a single simple system 

We shall now apply the thermal axioms, T4 in particular, to derive the comparison hypothesis, 
CH, for multiple scaled copies of simple systems. 

THEOREM 4.4 (Comparison in multiple scaled copies of a simple system). Let $\Gamma$
be the state space of a simple system and let $a_1, \dots, a_N, a_1', \dots, a_M'$ be positive real numbers with
$a_1 + \cdots + a_N = a_1' + \cdots + a_M'$. Then all points in $a_1\Gamma \times \cdots \times a_N\Gamma$ are comparable to all points in
$a_1'\Gamma \times \cdots \times a_M'\Gamma$.

Proof: We may suppose that $a_1 + \cdots + a_N = a_1' + \cdots + a_M' = 1$. We shall show that for any
points $Y_1, \dots, Y_N, Y_1', \dots, Y_M' \in \Gamma$ there exist points $X_0 \prec\prec X_1$ in $\Gamma$ such that
$(a_1 Y_1, \dots, a_N Y_N) \stackrel{A}{\sim} ((1 - \alpha)X_0, \alpha X_1)$ and
$(a_1' Y_1', \dots, a_M' Y_M') \stackrel{A}{\sim} ((1 - \alpha')X_0, \alpha' X_1)$ with $\alpha, \alpha' \in \mathbb{R}$. This will prove the
statement because of Lemma 2.2.

By Theorem 3.7, the points in $\Gamma$ are comparable, and hence there are points $X_0 \prec X_1$ such
that all the points $Y_1, \dots, Y_N, Y_1', \dots, Y_M'$ are contained in the strip
$\Sigma(X_0, X_1) = \{X \in \Gamma : X_0 \prec X \prec X_1\}$; in particular, these $N + M$ points can be linearly ordered and
$X_0$ and $X_1$ can be chosen from this set. If $X_0 \stackrel{A}{\sim} X_1$ then all the points in the strip would be
equivalent and the assertion would hold trivially. Hence we may assume that $X_0 \prec\prec X_1$. Moreover, it is
clearly sufficient to prove that for each $Y \in \Sigma(X_0, X_1)$ one has $Y \stackrel{A}{\sim} ((1 - \lambda)X_0, \lambda X_1)$ for
some $\lambda \in [0, 1]$, because the general case then follows by the splitting and recombination axiom A5
and Lemma 2.2.

If $X_0 \stackrel{T}{\sim} X_1$ (or, if there exist $X_0' \stackrel{A}{\sim} X_0$ and $X_1' \stackrel{A}{\sim} X_1$ with $X_0' \stackrel{T}{\sim} X_1'$, which is just as good
for the present purpose) the existence of such a $\lambda$ for a given $Y$ can be seen as follows. For any
$\lambda' \in [0, 1]$ the states $((1 - \lambda')X_0, \lambda' X_1)$ and $((1 - \lambda')Y, \lambda' Y)$ are adiabatically equivalent to certain
states in the state space of a simple system, thanks to thermal axiom T2. Hence $((1 - \lambda')X_0, \lambda' X_1)$
and $Y \stackrel{A}{\sim} ((1 - \lambda')Y, \lambda' Y)$ are comparable. We define

$$\lambda = \sup\{\lambda' \in [0, 1] : ((1 - \lambda')X_0, \lambda' X_1) \prec Y\}. \tag{4.3}$$

Since $X_0 \prec Y$ the set on the right of (4.3) is not empty (it contains 0) and therefore $\lambda$ is well defined
and $0 \le \lambda \le 1$. Next, one shows that $((1 - \lambda)X_0, \lambda X_1) \stackrel{A}{\sim} Y$ by exactly the same argument as in
Lemma 2.3. (Note that this argument only uses that $Y$ and $((1 - \lambda')X_0, \lambda' X_1)$ are comparable.)
Thus, our theorem is established under the hypothesis that $X_0 \stackrel{T}{\sim} X_1$.

The following Lemma 4.1 will be needed to show that we can, indeed, always choose $X_0$ and
$X_1$ so that $X_0 \stackrel{T}{\sim} X_1$.

LEMMA 4.1 (Extension of strips). For any state space (of a simple or a compound system),
if $X_0 \prec\prec X_1$, $X_0' \prec\prec X_1'$ and if

$$X \stackrel{A}{\sim} ((1 - \lambda)X_0, \lambda X_1) \tag{4.4}$$
$$X_1 \stackrel{A}{\sim} ((1 - \lambda_1)X_0', \lambda_1 X_1') \tag{4.5}$$
$$X_0' \stackrel{A}{\sim} ((1 - \lambda_0)X_0, \lambda_0 X_1) \tag{4.6}$$

then

$$X \stackrel{A}{\sim} ((1 - \mu)X_0, \mu X_1') \tag{4.7}$$

with

$$\mu = \frac{\lambda \lambda_1}{1 - \lambda_0 + \lambda_0 \lambda_1}.$$

Proof: We first consider the special case $X = X_1$, i.e., $\lambda = 1$. By simple arithmetic, using the
cancellation law, one obtains (4.7) from (4.5) and (4.6) with
$\mu = \mu_1 = \lambda_1 / (1 - \lambda_0 + \lambda_0 \lambda_1)$. The general case
now follows by inserting the splitting of $X_1$ into (4.4) and recombining.

■
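
The formula for $\mu$ in Lemma 4.1 can be cross-checked with simple arithmetic. The sketch below is only a consistency check under an extra assumption that the lemma itself does not need: it supposes an additive entropy that assigns the value $(1-t)\,s(A) + t\,s(B)$ to a state $((1-t)A, tB)$. The function names and the numerical values are hypothetical.

```python
def mu_from_lemma(lam, lam0, lam1):
    """The coefficient mu stated in Lemma 4.1."""
    return lam * lam1 / (1.0 - lam0 + lam0 * lam1)

def check_extension(s_X0, s_X1p, lam, lam0, lam1):
    """Compare the entropy of X computed via (4.4)-(4.6) with the one implied by (4.7).

    s_X0 and s_X1p are assumed entropy values of X0 and X1'. The values of X1 and X0'
    follow by solving the two linear equations that (4.5) and (4.6) impose.
    """
    det = 1.0 - lam0 * (1.0 - lam1)
    s_X1 = (lam1 * s_X1p + (1.0 - lam1) * (1.0 - lam0) * s_X0) / det   # from (4.5), (4.6)
    s_X0p = (1.0 - lam0) * s_X0 + lam0 * s_X1                           # (4.6)
    s_X_via_44 = (1.0 - lam) * s_X0 + lam * s_X1                        # (4.4)
    mu = mu_from_lemma(lam, lam0, lam1)
    s_X_via_47 = (1.0 - mu) * s_X0 + mu * s_X1p                         # (4.7)
    return s_X_via_44, s_X_via_47, s_X0p, s_X1

via_44, via_47, s_X0p, s_X1 = check_extension(0.0, 2.0, lam=0.5, lam0=0.4, lam1=0.5)
```

With these sample numbers both routes give the same entropy for $X$, and the intermediate values respect the ordering of the two overlapping strips.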

Proof of Theorem 4.4 continued: By the transversality property, each point $X$ lies in some
strip $\Sigma(X_0, X_1)$ with $X_0 \prec\prec X_1$ and $X_0 \stackrel{T}{\sim} X_1$. Hence the whole state space can be covered by
strips $\Sigma(X_0^{(i)}, X_1^{(i)})$ with $X_0^{(i)} \prec\prec X_1^{(i)}$ and $X_0^{(i)} \stackrel{T}{\sim} X_1^{(i)}$. Here $i$ belongs to some index set. Since
all adiabats $\partial A_X$ with $X \in \Gamma$ are relatively closed in $\Gamma$ by axiom S3 we can even cover each $X$ (and
hence $\Gamma$) with open strips
$\mathring{\Sigma}_i := \mathring{\Sigma}(X_0^{(i)}, X_1^{(i)}) = \{X : X_0^{(i)} \prec\prec X \prec\prec X_1^{(i)}\}$ with $X_0^{(i)} \stackrel{T}{\sim} X_1^{(i)}$.
Moreover, any compact subset, $C$, of $\Gamma$ is covered by a finite number of such strips $\mathring{\Sigma}_i$, $i = 1, \dots, K$,
and if $C$ is connected we may assume that $\mathring{\Sigma}_i \cap \mathring{\Sigma}_{i+1} \neq \emptyset$. If $\bar{X}_0$ denotes the smallest of the elements
$X_0^{(i)}$ (with respect to the relation $\prec$) and $\bar{X}_1$ the largest of the $X_1^{(i)}$, it follows from Lemma 2.3 that for any
$X \in C$ we have $X \stackrel{A}{\sim} ((1 - \mu)\bar{X}_0, \mu \bar{X}_1)$ for some $\mu$. If a finite number of points, $Y_1, \dots, Y_N, Y_1', \dots, Y_M'$,
is given, we take $C$ to be a polygon connecting the points, which exists because $\Gamma$ is convex. Hence
each of the points $Y_1, \dots, Y_N, Y_1', \dots, Y_M'$ is equivalent to $((1 - \lambda)\bar{X}_0, \lambda \bar{X}_1)$ for some $\lambda$, and the
proof is complete. ■
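
The supremum in (4.3) can be approximated by bisection once the comparisons it relies on can be decided. The sketch below is illustrative only: it decides the relation $((1-\lambda')X_0, \lambda' X_1) \prec Y$ through an assumed additive entropy with hypothetical values `s0`, `s1`, `sY` for $X_0$, $X_1$, $Y$, whereas the proof uses nothing but comparability.

```python
def mix_precedes(lam, s0, s1, sY):
    """Model of ((1 - lam) X0, lam X1) < Y through an assumed additive entropy."""
    return (1.0 - lam) * s0 + lam * s1 <= sY

def sup_lambda(s0, s1, sY, iterations=60):
    """Approximate the supremum in (4.3) by bisection on [0, 1].

    Assumes s0 <= sY, so that lam = 0 belongs to the set, and s0 < s1,
    so that membership is monotone in lam.
    """
    if mix_precedes(1.0, s0, s1, sY):
        return 1.0
    lo, hi = 0.0, 1.0          # invariant: lo is in the set, hi is not
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mix_precedes(mid, s0, s1, sY):
            lo = mid
        else:
            hi = mid
    return lo

lam = sup_lambda(s0=1.0, s1=3.0, sY=1.5)
```

In this model the result agrees with the closed form $(s_Y - s_0)/(s_1 - s_0)$, which is how the canonical entropy of Section II labels the point $Y$ inside the strip.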

The comparison hypothesis, CH, has thus been established for multiple scaled copies of a
single simple system. From Theorem 2.2 we then know that for such a system the relation $\prec$ is
characterized by an entropy function, which is unique up to an affine transformation $S \to aS + B$.

2. Products of different simple systems 

Our next goal is to verify the comparison hypothesis for products of different simple systems. 
For this task we shall appeal to the following: 

THEOREM 4.5 (Criterion for comparison in product spaces). Let $\Gamma_1$ and $\Gamma_2$ be two
(possibly unrelated) state spaces. Assume there is a relation $\prec$ satisfying axioms A1-A6 that holds
for $\Gamma_1$, $\Gamma_2$ and their scaled products. Additionally, $\prec$ satisfies the comparison hypothesis CH on $\Gamma_1$
and its multiple scaled copies and on $\Gamma_2$ and its multiple scaled copies but, a priori, not necessarily
on $\Gamma_1 \times \Gamma_2$ or any other products involving both $\Gamma_1$ and $\Gamma_2$.
If there are points $X_0, X_1 \in \Gamma_1$ and $Y_0, Y_1 \in \Gamma_2$ such that

$$X_0 \prec\prec X_1, \qquad Y_0 \prec\prec Y_1 \tag{4.8}$$

$$(X_0, Y_1) \stackrel{A}{\sim} (X_1, Y_0), \tag{4.9}$$

then the comparison hypothesis CH holds on products of any number of scaled copies of $\Gamma_1$ and $\Gamma_2$.

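Proof: Since the comparison principle holds for $\Gamma_1$ and $\Gamma_2$ these spaces have canonical entropy
functions corresponding, respectively, to the reference points $X_0, X_1$ and $Y_0, Y_1$. If $X \in \Gamma_1$ and
$\lambda_1 = S_1(X|X_0, X_1)$ (in the notation of eq. (2.15)) then, by Lemma 2.3,

$$X \stackrel{A}{\sim} ((1 - \lambda_1)X_0, \lambda_1 X_1)$$

and similarly, for $Y \in \Gamma_2$ and $\lambda_2 = S_2(Y|Y_0, Y_1)$,

$$Y \stackrel{A}{\sim} ((1 - \lambda_2)Y_0, \lambda_2 Y_1).$$

Set $\lambda = \tfrac{1}{2}(\lambda_1 + \lambda_2)$ and $\delta = \tfrac{1}{2}(\lambda_1 - \lambda_2)$. We then have

$(X, Y) \stackrel{A}{\sim} ((1 - \lambda_1)X_0, \lambda_1 X_1, (1 - \lambda_2)Y_0, \lambda_2 Y_1)$   by A3

$\stackrel{A}{\sim} ((1 - \lambda)X_0, -\delta X_0, \lambda X_1, \delta X_1, (1 - \lambda)Y_0, \delta Y_0, \lambda Y_1, -\delta Y_1)$   by A5

$\stackrel{A}{\sim} ((1 - \lambda)X_0, -\delta X_0, \lambda X_1, \delta X_0, (1 - \lambda)Y_0, \delta Y_1, \lambda Y_1, -\delta Y_1)$   by (4.9), A3, A4

$\stackrel{A}{\sim} ((1 - \lambda)(X_0, Y_0), \lambda(X_1, Y_1))$   by A5.

Thus, every point in $\Gamma_1 \times \Gamma_2 =: \Gamma_{12}$ is equivalent to a point of the form $((1 - \lambda)Z_0, \lambda Z_1)$ in
$(1 - \lambda)\Gamma_{12} \times \lambda\Gamma_{12}$ with $Z_0 = (X_0, Y_0)$ and $Z_1 = (X_1, Y_1)$ fixed and $\lambda \in \mathbb{R}$. But any two points of
this form (with the same $Z_0, Z_1$, but variable $\lambda$) are comparable by Lemma 2.2.

A similar argument extends CH to multiple scaled copies of $\Gamma_{12}$. Finally, by induction, CH
extends to scaled products of $\Gamma_{12}$ and $\Gamma_1$ and $\Gamma_2$, i.e., to scaled products of arbitrarily many copies
of $\Gamma_1$ and $\Gamma_2$. ■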
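
The coefficients in the chain of equivalences above can be checked by a short calculation. The following worked equation is only a bookkeeping aid under an assumption made for illustration: $s$ denotes a hypothetical additive, extensive entropy for which (4.9) reads $s(X_0) + s(Y_1) = s(X_1) + s(Y_0)$. The existence of such consistent entropies on the product is what the theorem helps to establish, so this does not replace the proof.

```latex
% Assumption (illustration only): s is additive and extensive,
% and the calibrator condition (4.9) gives s(X_0) + s(Y_1) = s(X_1) + s(Y_0).
% Recall 1-\lambda_1 = (1-\lambda)-\delta, \lambda_1 = \lambda+\delta,
%        1-\lambda_2 = (1-\lambda)+\delta, \lambda_2 = \lambda-\delta.
\begin{align*}
s(X) + s(Y)
 &= (1-\lambda_1)\,s(X_0) + \lambda_1\,s(X_1) + (1-\lambda_2)\,s(Y_0) + \lambda_2\,s(Y_1) \\
 &= (1-\lambda)\,[s(X_0) + s(Y_0)] + \lambda\,[s(X_1) + s(Y_1)]
    + \delta\,[s(X_1) + s(Y_0) - s(X_0) - s(Y_1)] \\
 &= (1-\lambda)\,[s(X_0) + s(Y_0)] + \lambda\,[s(X_1) + s(Y_1)] .
\end{align*}
```

The last line is the entropy of $((1-\lambda)Z_0, \lambda Z_1)$ with $Z_0 = (X_0, Y_0)$ and $Z_1 = (X_1, Y_1)$, in agreement with the final step of the chain.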

We shall refer to a quadruple of points satisfying (4.8) and (4.9) as an entropy calibrator. 
To establish the existence of such calibrators we need the following result. 

THEOREM 4.6 (Transversality and location of isotherms). Let $\Gamma$ be the state space
of a simple system that satisfies the thermal axioms T1-T4. Then either
(i) All points in $\Gamma$ are in thermal equilibrium, i.e., $X \stackrel{T}{\sim} Y$ for all $X, Y \in \Gamma$,

or

(ii) There is at least one adiabat in $\Gamma$ (i.e., at least one $\partial A_X$) that has at least two points that are
not in thermal equilibrium, i.e., $Z \stackrel{T}{\sim} Y$ is false for some pair of points $Z$ and $Y$ in $\partial A_X$.

Proof: Our proof will be somewhat indirect because it will use the fact (which we already
proved) that there is a concave entropy function, $S$, on $\Gamma$ which satisfies the maximum principle,
Theorem 4.3 (for $\Gamma_1 = \Gamma_2 = \Gamma$). This means that if $\mathcal{R} \subset \mathbb{R}$ denotes the range of $S$ on $\Gamma$ then the
sets

$$E_\sigma = \{X \in \Gamma : S(X) = \sigma\}, \qquad \sigma \in \mathcal{R}$$

are precisely the adiabats of $\Gamma$ and, moreover, $X = (U_1, V_1)$, $Y = (U_2, V_2)$ in $\Gamma$ satisfy $X \stackrel{T}{\sim} Y$ if
and only if $W = U_2$ maximizes $S(U_1 + U_2 - W, V_1) + S(W, V_2)$ over all choices of $W$ such that
$(U_1 + U_2 - W, V_1) \in \Gamma$ and $(W, V_2) \in \Gamma$. Furthermore, the concavity of $S$ (and hence its continuity
on the connected open set $\Gamma$) implies that $\mathcal{R}$ is connected, i.e., $\mathcal{R}$ is an interval.

Let us assume now that (ii) is false. By the zeroth law, T3, $\stackrel{T}{\sim}$ is an equivalence relation that
divides $\Gamma$ into disjoint equivalence classes. Since (ii) is false, each such equivalence class must be a
union of adiabats, which means that the equivalence classes are represented by a family of disjoint
subsets of $\mathcal{R}$. Thus

$$\mathcal{R} = \bigcup_{\alpha \in I} \mathcal{R}_\alpha$$

where $I$ is some index set, $\mathcal{R}_\alpha$ is a subset of $\mathcal{R}$, $\mathcal{R}_\alpha \cap \mathcal{R}_\beta = \emptyset$ for $\alpha \neq \beta$, and
$E_\sigma \stackrel{T}{\sim} E_\tau$ if and only if $\sigma$ and $\tau$ are in some common $\mathcal{R}_\alpha$.

We will now prove that each $\mathcal{R}_\alpha$ is an open set. It is then an elementary topological fact (using
the connectedness of $\Gamma$) that there can be only one non-empty $\mathcal{R}_\alpha$, i.e., (i) holds, and our proof is
complete.

The concavity of $S(U, V)$ with respect to $U$ for each fixed $V$ implies the existence of an upper
and lower $U$-derivative at each point, which we denote by $1/T_+$ and $1/T_-$, i.e.,

$$(1/T_\pm)(U, V) = \pm \lim_{\varepsilon \searrow 0} \varepsilon^{-1} [S(U \pm \varepsilon, V) - S(U, V)].$$

Theorem 4.3 implies that $X \stackrel{T}{\sim} Y$ if and only if the closed intervals $[T_-(X), T_+(X)]$ and $[T_-(Y), T_+(Y)]$
are not disjoint. Suppose that some $\mathcal{R}_\alpha$ is not open, i.e., there is $\sigma \in \mathcal{R}_\alpha$ and either a sequence
$\sigma_1 > \sigma_2 > \sigma_3 > \cdots$ converging to $\sigma$ or a sequence $\sigma_1 < \sigma_2 < \sigma_3 < \cdots$ converging to $\sigma$ with $\sigma_i \notin \mathcal{R}_\alpha$.
Suppose the former (the other case is similar). Then (since $T_\pm$ are monotone increasing in $U$ by
the concavity of $S$) we can conclude that for every $Y \in E_{\sigma_i}$ and every $X \in E_\sigma$

$$T_-(Y) > T_+(X). \tag{4.10}$$

We also note, by the monotonicity of $T_\pm$ in $U$, that (4.10) necessarily holds if $Y \in E_\mu$ and $\mu \ge \sigma_i$;
hence (4.10) holds for all $Y \in E_\mu$ for any $\mu > \sigma$ (because $\sigma_i \searrow \sigma$). On the other hand, if $\tau < \sigma$

$$T_+(Z) < T_-(X)$$

for $Z \in E_\tau$ and $X \in E_\sigma$. This contradicts transversality, namely the hypothesis that there is
$\tau < \sigma < \mu$, $Z \in E_\tau$, $Y \in E_\mu$, such that $[T_-(Z), T_+(Z)] \cap [T_-(Y), T_+(Y)]$ is not empty. ■
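
The interval criterion used in this proof can be illustrated numerically. The sketch below is an assumption-laden toy, not part of the argument: it takes two hypothetical entropies $S_1(U) = \ln U$ and $S_2(U) = 2 \ln U$ without work coordinates, estimates $1/T_\pm$ by one-sided difference quotients with a small step `eps`, and tests whether the resulting intervals $[T_-, T_+]$ overlap. For a concave function the two difference quotients bracket the derivative, so states of equal temperature yield overlapping intervals.

```python
import math

def temperature_interval(S, U, eps=1e-6):
    """Return (T_minus, T_plus) at energy U from one-sided difference quotients of S."""
    inv_T_plus = (S(U + eps) - S(U)) / eps     # upper (right) U-derivative, 1/T+
    inv_T_minus = (S(U) - S(U - eps)) / eps    # lower (left) U-derivative, 1/T-
    return 1.0 / inv_T_minus, 1.0 / inv_T_plus

def intervals_overlap(a, b):
    """True if the closed intervals a = (a_lo, a_hi) and b = (b_lo, b_hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

S1 = lambda U: math.log(U)          # assumed entropy, temperature T = U
S2 = lambda U: 2.0 * math.log(U)    # assumed entropy, temperature T = U / 2

same_T = intervals_overlap(temperature_interval(S1, 1.0), temperature_interval(S2, 2.0))
different_T = intervals_overlap(temperature_interval(S1, 1.0), temperature_interval(S2, 1.0))
```

Here `same_T` compares two states that both have temperature 1 in the assumed model, while `different_T` compares temperatures 1 and 1/2.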

THEOREM 4.7 (Existence of calibrators). Let $\Gamma_1$ and $\Gamma_2$ be state spaces of simple
systems and assume the thermal axioms, T1-T4, in particular the transversality property T4. Then
there exist states $X_0, X_1 \in \Gamma_1$ and $Y_0, Y_1 \in \Gamma_2$ such that

$$X_0 \prec\prec X_1 \quad\text{and}\quad Y_0 \prec\prec Y_1, \tag{4.11}$$

$$(X_0, Y_1) \stackrel{A}{\sim} (X_1, Y_0). \tag{4.12}$$



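Proof: Consider the simple system $\Delta_{12}$ obtained by thermally coupling $\Gamma_1$ and $\Gamma_2$. Fix some
$\bar{X} = (U_{\bar X}, V_{\bar X}) \in \Gamma_1$ and $\bar{Y} = (U_{\bar Y}, V_{\bar Y}) \in \Gamma_2$ with $\bar{X} \stackrel{T}{\sim} \bar{Y}$. We form the combined state
$\phi(\bar{X}, \bar{Y}) = (U_{\bar X} + U_{\bar Y}, V_{\bar X}, V_{\bar Y}) \in \Delta_{12}$ and consider the adiabat $\partial A_{\phi(\bar X, \bar Y)} \subset \Delta_{12}$. By axiom T2 every point
$Z \in \partial A_{\phi(\bar X, \bar Y)}$ can be split in at least one way as

$$\psi(Z) = ((U_X, V_X), (U_Y, V_Y)) \in \Gamma_1 \times \Gamma_2, \tag{4.13}$$

where $(V_X, V_Y)$ are the work coordinates of $Z$ with $U_X + U_Y = U_Z$ and where $X = (U_X, V_X)$,
$Y = (U_Y, V_Y)$ are in thermal equilibrium, i.e., $X \stackrel{T}{\sim} Y$. If the splitting in (4.13) is not unique, i.e., there
exist $X^{(1)}, Y^{(1)}$ and $X^{(2)}, Y^{(2)}$ satisfying these conditions, then we are done for the following reason:
First, $(X^{(1)}, Y^{(1)}) \stackrel{A}{\sim} (X^{(2)}, Y^{(2)})$ (by axiom T2). Second, since $U_{X^{(1)}} + U_{Y^{(1)}} = U_{X^{(2)}} + U_{Y^{(2)}}$ we
have either $U_{X^{(1)}} < U_{X^{(2)}}$, $U_{Y^{(1)}} > U_{Y^{(2)}}$ or $U_{X^{(1)}} > U_{X^{(2)}}$, $U_{Y^{(1)}} < U_{Y^{(2)}}$. This implies, by Theorem
3.4, that either $X^{(1)} \prec\prec X^{(2)}$ and $Y^{(2)} \prec\prec Y^{(1)}$, or $X^{(2)} \prec\prec X^{(1)}$ and $Y^{(1)} \prec\prec Y^{(2)}$.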

Let us assume, therefore, that the thermal splitting (4.13) of each $Z \in \partial A_{\phi(\bar X, \bar Y)}$ is unique so
we can write $\psi(Z) = (X, Y)$ with uniquely determined $X \stackrel{T}{\sim} Y$. (This means, in particular, that
alternative (i) in Theorem 4.6 is excluded.) If some pair $(X, Y)$ obtained in this way does not
satisfy $X \stackrel{A}{\sim} \bar{X}$ and $Y \stackrel{A}{\sim} \bar{Y}$, e.g., $X \prec\prec \bar{X}$ holds, then it follows from axiom A3 and the cancellation
law that $\bar{Y} \prec\prec Y$, and thus we have obtained points with the desired properties.

So let us suppose that $X \stackrel{A}{\sim} \bar{X}$ and $Y \stackrel{A}{\sim} \bar{Y}$ whenever $(X, Y) = \psi(Z)$ and $Z \in \partial A_{\phi(\bar X, \bar Y)}$. In other
words, $\psi(\partial A_{\phi(\bar X, \bar Y)}) \subset \partial A_{\bar X} \times \partial A_{\bar Y}$. We then claim that all $Z \in \partial A_{\phi(\bar X, \bar Y)}$ are in thermal equilibrium
with each other. By the zeroth law, T3, (and since $\rho(\partial A_{\phi(\bar X, \bar Y)})$ is open and connected, by the
definition of a simple system) it suffices to show that all points $(U, V_1, V_2)$ in $\partial A_{\phi(\bar X, \bar Y)}$ with $V_1$ fixed
are in thermal equilibrium with each other and, likewise, all points $(U, V_1, V_2)$ in $\partial A_{\phi(\bar X, \bar Y)}$ with $V_2$
fixed are in thermal equilibrium with each other. Now each fixed $V_1$ in $\rho(A_{\bar X})$ determines a unique
point $(U_1, V_1) \in \partial A_{\bar X}$ (by Theorem 3.5 (iii)). Since, by assumption, $\psi(U, V_1, V_2) \in \partial A_{\bar X} \times \partial A_{\bar Y}$ we
must then have

$$\psi(U, V_1, V_2) = ((U_1, V_1), (U_2, V_2)) \tag{4.14}$$

with $U_2 = U - U_1$. But (4.14), together with the zeroth law, implies that all points $(U, V_1, V_2) \in
\partial A_{\phi(\bar X, \bar Y)}$ with $V_1$ fixed are in thermal equilibrium with $(U_1, V_1)$ (because (4.14) shows that they
all have the same $\Gamma_1$ component) and hence they are in thermal equilibrium with each other. The
same argument shows that all points with fixed $V_2$ are in thermal equilibrium.

We have demonstrated that the hypothesis $X \stackrel{A}{\sim} \bar{X}$ and $Y \stackrel{A}{\sim} \bar{Y}$ for all $(X, Y) \in \psi(\partial A_{\phi(\bar X, \bar Y)})$
implies that all points in $\partial A_{\phi(\bar X, \bar Y)}$ are in thermal equilibrium. Since, by Theorem 4.6, at least
one adiabat in $\Delta_{12}$ contains at least two points not in thermal equilibrium, the existence of points
satisfying (4.11) and (4.12) is established. ■

Having established the entropy calibrators we may now appeal to Theorem 4.5 and summarize 
the discussion so far in the following theorem. 

THEOREM 4.8 (Entropy principle in products of simple systems). Assume Axioms A1-A7,
S1-S3 and T1-T4. Then the comparison hypothesis CH is valid in arbitrary scaled products of
simple systems. Hence, by Theorem 2.5, the relation $\prec$ among states in such state spaces is
characterized by an entropy function $S$. The entropy function is unique, up to an overall multiplicative
constant and one additive constant for each simple system under consideration.

C. The role of transversality 

It is conceptually important to give an example of a state space $\Gamma$ of a simple system and
a relation $\prec$ on its multiple scaled copies, so that all our axioms except T4 are satisfied. In this
example the comparison hypothesis CH is violated for the spaces $\Gamma \times \Gamma$ and hence the relation
cannot be characterized by an entropy function. This shows that the transversality axiom T4 is
essential for the proof of Theorem 4.8. The example we give is not entirely academic; it is based
on the physics of thermometers. See the discussion in the beginning of Section III.

For simplicity, we choose our system to be a degenerate simple system, i.e., its state space is
one-dimensional. (It can be interpreted as a system with a work coordinate in a trivial way, by
simply declaring that everything is independent of $V$ and the pressure function is identically zero.)
A hypothetical universe consisting only of scaled copies of such a system (in addition to mechanical
devices) might be referred to as a 'world of thermometers'. The relation $\prec$ is generated, physically
speaking, by two operations: "rubbing", which increases the energy, and thermal equilibration of
two scaled copies of the system.

To describe this in a more formal way we take as our state space $\Gamma = \mathbb{R}_+ = \{U : U > 0\}$.
Rubbing the system increases $U$ and we accordingly define $\prec$ on $\Gamma$ simply by the relation $\le$ on the
real numbers $U$. On $\Gamma^{(\lambda_1)} \times \Gamma^{(\lambda_2)}$ we define the forward sector of $(\lambda_1 U_1, \lambda_2 U_2)$ as the convex hull
of the union $A \cup B$ of two sets of points,

$$A = \{(\lambda_1 U_1', \lambda_2 U_2') : U_1 \le U_1',\ U_2 \le U_2'\}$$

and

$$B = \{(\lambda_1 U_1'', \lambda_2 U_2'') : \bar{U} \le U_1'',\ \bar{U} \le U_2''\}$$

with

$$\bar{U} = (\lambda_1 + \lambda_2)^{-1}(\lambda_1 U_1 + \lambda_2 U_2).$$

This choice of forward sector is minimally consistent with our axioms. The set $A$ corresponds
to rubbing the individual thermometers while $B$ corresponds to thermal equilibration followed by
rubbing.

The forward sector of a point $(\lambda_1 U_1, \dots, \lambda_n U_n)$ in the product of more than two scaled copies
of $\Gamma$ is then defined as the convex hull of all points of the form

$$(\lambda_1 U_1, \dots, \lambda_i U_i', \dots, \lambda_j U_j', \dots, \lambda_n U_n) \quad\text{with}\quad (\lambda_i U_i, \lambda_j U_j) \prec (\lambda_i U_i', \lambda_j U_j').$$

The thermal join of $\Gamma^{(\lambda_1)}$ and $\Gamma^{(\lambda_2)}$ is identified with $\Gamma^{(\lambda_1 + \lambda_2)}$. Thermal equilibration is simply
addition of the energies, and $\lambda_1 U_1$ is in thermal equilibrium with $\lambda_2 U_2$ if and only if $U_1 = U_2$.

Since the adiabats and isotherms in $\Gamma$ coincide (both consist only of single points) axiom T4
is violated in this example. The forward sectors in $\Gamma \times \Gamma$ are shown in Figure 7. It is evident
that these sectors are not nested and hence cannot be characterized by an entropy function. This
example thus illustrates how violation of the transversality axiom T4 can prevent the existence of
an entropy function for a relation $\prec$ that is well behaved in other ways.

Insert Figure 7 here 

On the other hand we may recall the usual entropy function for a body with constant heat
capacity, namely

$$S(U) = \ln U. \tag{4.15}$$

In the above example this function defines, by simple addition of entropies in the obvious way,
another relation, $\prec^*$, on the multiple scaled copies of $\Gamma$ which extends the relation $\prec$ previously
defined. On $\Gamma$ the two relations coincide (since $S$ is a monotone function of $U$), but on $\Gamma \times \Gamma$
this is no longer the case: The inequality $S(U_1) + S(U_2) \le S(U_1') + S(U_2')$, i.e., $U_1 U_2 \le U_1' U_2'$, is
only a necessary but not a sufficient condition for $(U_1, U_2) \prec (U_1', U_2')$ to hold. The passage from
$(U_1, U_2)$ to $(U_1', U_2')$ in the sense of the relation $\prec^*$ (but not $\prec$) may, however, be accomplished by
coupling each copy of $\Gamma$ to another system, e.g., to a Carnot machine that uses the two copies of
$\Gamma$ as heat reservoirs. From the relation $\prec^*$ one could then reconstruct $S$ in (4.15) by the method
of Section II. The lesson drawn is that even if T4 fails to hold for a system, it may be possible to
construct an entropy function for that system, provided its thermal join with some other system
behaves normally.
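
The gap between the two relations in the world of thermometers can be made concrete. The sketch below is a minimal illustration with hypothetical function names: it tests membership in the forward sector defined above, using the observation that the convex hull of the two quadrants $A$ and $B$ consists of all points lying componentwise above some point of the segment joining the corners $(U_1, U_2)$ and $(\bar U, \bar U)$, and it compares the result with the entropy criterion built from $S(U) = \ln U$.

```python
import math

def in_forward_sector(start, target, lam=(1.0, 1.0), tol=1e-12):
    """Is `target` in the forward sector of `start` in the two-thermometer example?

    Both arguments are pairs of unscaled energies (U1, U2); lam holds the scaling
    factors. We look for t in [0, 1] with target >= start + t * (mean - start)
    componentwise, where mean is the common energy after thermal equilibration.
    """
    (u1, u2), (z1, z2) = start, target
    mean = (lam[0] * u1 + lam[1] * u2) / (lam[0] + lam[1])
    t_lo, t_hi = 0.0, 1.0
    for u, z in ((u1, z1), (u2, z2)):
        d = mean - u                      # displacement of the corner as t goes 0 -> 1
        if abs(d) <= tol:
            if z < u - tol:
                return False
        elif d > 0:
            t_hi = min(t_hi, (z - u) / d)
        else:
            t_lo = max(t_lo, (z - u) / d)
    return t_lo <= t_hi + tol

def entropy_allows(start, target, lam=(1.0, 1.0)):
    """Necessary condition from (4.15): the summed entropy must not decrease."""
    total = lambda p: sum(l * math.log(u) for l, u in zip(lam, p))
    return total(start) <= total(target)

equilibrated = in_forward_sector((1.0, 3.0), (2.0, 2.0))
allowed_by_entropy = entropy_allows((1.0, 3.0), (1.5, 2.4))
reachable = in_forward_sector((1.0, 3.0), (1.5, 2.4))
reverse = in_forward_sector((1.5, 2.4), (1.0, 3.0))
```

For the sample states, equilibration from (1, 3) to (2, 2) is possible, while (1.5, 2.4) has larger total entropy than (1, 3) and yet lies in neither direction of the relation, so the two states are not comparable.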

A precise version of this idea is given in the following theorem. 

THEOREM 4.9 (Entropy without transversality). Suppose $\Gamma_1$ and $\Gamma_2$ are normal or
degenerate simple systems and assume that axioms A1-A5, T1-T3 and T5 hold for the relation
$\prec$ on scaled products of $\Gamma_1$ and $\Gamma_2$. (They already hold for $\Gamma_1$ and $\Gamma_2$ separately, by definition.)
Let $\Delta_{12}$ be the thermal join of $\Gamma_1$ and $\Gamma_2$ and suppose that $\Delta_{12}$ and $\Gamma_2$ have consistent entropy
functions $S_{12}$ and $S_2$, which holds, in particular, if T4 is valid for $\Delta_{12}$ and $\Gamma_2$. Then $\Gamma_1$ has an
entropy function $S_1$ that is consistent with $S_2$ and satisfies

$$S_{12}(\phi(X, Y)) = S_1(X) + S_2(Y)$$

if $X \stackrel{T}{\sim} Y$, where $\phi$ is the canonical map $\Gamma_1 \times \Gamma_2 \to \Delta_{12}$, given by $\phi(X, Y) = (U_X + U_Y, V_X, V_Y)$ if
$X = (U_X, V_X)$ and $Y = (U_Y, V_Y)$.

Proof: Given $X \in \Gamma_1$ we can, by axiom T5, find a $Y \in \Gamma_2$ with $X \stackrel{T}{\sim} Y$, and hence
$Z := \phi(X, Y) \stackrel{A}{\sim} (X, Y)$ by axiom T2. If $Y' \in \Gamma_2$ is another point with $X \stackrel{T}{\sim} Y'$ and $Z' := \phi(X, Y')$
then, by axiom T2, $(Y', Z) \stackrel{A}{\sim} (Y', X, Y) \stackrel{A}{\sim} (Y, (X, Y')) \stackrel{A}{\sim} (Y, Z')$. Since $S_2$ and $S_{12}$ are consistent
entropies, this means that

$$S_2(Y') + S_{12}(Z) = S_2(Y) + S_{12}(Z'),$$

or

$$S_{12}(Z) - S_2(Y) = S_{12}(Z') - S_2(Y'). \tag{4.16}$$

We can thus define $S_1$ on $\Gamma_1$ by

$$S_1(X) := S_{12}(\phi(X, Y)) - S_2(Y) \tag{4.17}$$

for each $X \in \Gamma_1$ and for any $Y$ satisfying $Y \stackrel{T}{\sim} X$, because, according to (4.16), the right side of
(4.17) is independent of $Y$, as long as $Y \stackrel{T}{\sim} X$.
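
Definition (4.17) can be followed through in a toy model. The sketch below is purely illustrative and rests on assumptions that are not in the text: two bodies without work coordinates and with hypothetical constant heat capacities `C1` and `C2`, the entropies $S_2(U) = C_2 \ln(U/C_2)$ and $S_{12}(U) = (C_1 + C_2)\ln(U/(C_1 + C_2))$ taken as given, and thermal equilibrium described by $U_X/C_1 = U_Y/C_2$. The entropy of the first body is then recovered from (4.17) alone.

```python
import math

C1, C2 = 2.0, 3.0   # hypothetical heat capacities of the two bodies

def S2(U):
    """Assumed entropy of the second system."""
    return C2 * math.log(U / C2)

def S12(U):
    """Assumed entropy of the thermal join, consistent with S2 in this toy model."""
    return (C1 + C2) * math.log(U / (C1 + C2))

def partner_energy(U_X):
    """Energy of the state Y of system 2 in thermal equilibrium with X (assumed rule)."""
    return C2 * U_X / C1

def S1(U_X):
    """Entropy of system 1 defined through (4.17): S12(phi(X, Y)) - S2(Y)."""
    U_Y = partner_energy(U_X)
    return S12(U_X + U_Y) - S2(U_Y)

recovered = S1(5.0)
expected = C1 * math.log(5.0 / C1)   # closed form C1 * ln(U / C1) for this model
```

In this model the recovered function is $C_1 \ln(U/C_1)$, the entropy one would write down directly for a body with heat capacity $C_1$, which is what the theorem leads one to expect.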

To check that $S_1$ is an entropy on $\Gamma_1$ we show first that the relation

$$(X_1, X_2) \prec (X_1', X_2')$$

with $X_1, X_2, X_1', X_2' \in \Gamma_1$ is equivalent to

$$S_1(X_1) + S_1(X_2) \le S_1(X_1') + S_1(X_2'). \tag{4.18}$$

We pick $Y_1, Y_2, Y_1', Y_2' \in \Gamma_2$ with $Y_1 \stackrel{T}{\sim} X_1$, $Y_2 \stackrel{T}{\sim} X_2$, etc. and insert the definition (4.17) of $S_1$ into
(4.18). We then see that (4.18) is equivalent to

$$S_{12}(\phi(X_1, Y_1)) + S_2(Y_1') + S_{12}(\phi(X_2, Y_2)) + S_2(Y_2')
\le S_{12}(\phi(X_1', Y_1')) + S_2(Y_1) + S_{12}(\phi(X_2', Y_2')) + S_2(Y_2).$$

Since $S_{12}$ and $S_2$ are consistent entropies, this is equivalent to

$$(\phi(X_1, Y_1), Y_1', \phi(X_2, Y_2), Y_2') \prec (\phi(X_1', Y_1'), Y_1, \phi(X_2', Y_2'), Y_2).$$

By the splitting axiom T2 this is equivalent to

$$(X_1, Y_1, Y_1', X_2, Y_2, Y_2') \prec (X_1', Y_1', Y_1, X_2', Y_2', Y_2).$$

The cancellation law then tells us that this holds if and only if $(X_1, X_2) \prec (X_1', X_2')$.

To verify more generally that $S_1$ characterizes the relation on all multiple scaled copies of
$\Gamma_1$ one may proceed in exactly the same way, using the scale invariance of thermal equilibrium
(Theorem 4.1) and the hypothesis that $S_{12}$ and $S_2$ are entropy functions, which means that they
characterize the relation on all products of scaled copies of $\Delta_{12}$ and $\Gamma_2$. ■

V. TEMPERATURE AND ITS PROPERTIES



Up to now we have succeeded in proving the existence of entropy functions that do everything 
they should do, namely specify exactly the adiabatic processes that can occur among systems, 
both simple and compound. The thermal join was needed in order to relate different systems, or 
copies of the same system to each other, but temperature, as a numerical quantifier of thermal 
equilibrium, was never used. Not even the concept of 'hot and cold' was used. In the present 
section we shall define temperature and show that it has all the properties it is normally expected 
to have. Temperature, then, is a corollary of entropy; it is epilogue rather than prologue. 

One of our main results here is equation (5.3): Thermal equilibrium and equality of temperature 
are the same thing. Another one is Theorem 5.3 which gives the differentiability of the entropy and 
which leads to Maxwell's equations and other manipulations of derivatives that are to be found in 
the usual textbook treatment of thermodynamics. 

Temperature will be defined only for simple systems (because 1/(temperature) is the variable
dual to energy and it is only the simple systems that have only one energy variable).

A. Differentiability of entropy and the existence of temperature 

The entropy function, $S$, defined on the (open, convex) state space, $\Gamma$, of a simple system is
concave (Theorem 2.8). Therefore (as already mentioned in the proof of Theorem 4.5) the upper
and lower partial derivatives of $S$ with respect to $U$ (and also with respect to $V$) exist at every
point $X \in \Gamma$, i.e., the limits

$$\frac{1}{T_+(X)} = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon}\,[S(U+\varepsilon, V) - S(U,V)]$$

$$\frac{1}{T_-(X)} = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon}\,[S(U,V) - S(U-\varepsilon, V)]$$

exist for every $X = (U,V) \in \Gamma$. The functions $T_+(X)$ (resp. $T_-(X)$) are finite and positive
everywhere (since $S$ is strictly monotone increasing in $U$ for each fixed $V$, by Planck's principle,
Theorem 3.4). These functions are called, respectively, the upper and lower temperatures.
Evidently, concavity implies that if $U_1 < U_2$

$$T_-(U_1,V) \le T_+(U_1,V) \le T_-(U_2,V) \le T_+(U_2,V) \tag{5.1}$$

for all $V$. The concavity of $S$ alone does not imply continuity of these functions. Our goal here is
to prove continuity by invoking some of our earlier axioms.
First, we prove a limited kind of continuity.
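
To make the upper and lower temperatures concrete, the following sketch approximates the two one-sided limits by difference quotients. The entropy `kinked_entropy` is a hypothetical concave function with a deliberate kink at $U = 1$; it is an assumption chosen for illustration, not a system discussed in the text.

```python
def upper_lower_temperature(S, U, V, eps=1e-6):
    """Approximate (T_minus, T_plus) at (U, V) from one-sided difference quotients in U."""
    right_slope = (S(U + eps, V) - S(U, V)) / eps  # approximates 1/T_+
    left_slope = (S(U, V) - S(U - eps, V)) / eps   # approximates 1/T_-
    return 1.0 / left_slope, 1.0 / right_slope


def kinked_entropy(U, V):
    """Hypothetical concave entropy: slope 1 in U below U = 1 and slope 1/2 above."""
    return min(U, 0.5 * U + 0.5) + V


T_minus, T_plus = upper_lower_temperature(kinked_entropy, 1.0, 1.0)
print(T_minus, T_plus)  # close to 1 and 2: the derivative jumps at U = 1
```

At the kink the two values differ, in agreement with $T_- \le T_+$ in (5.1); away from the kink they coincide.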

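LEMMA 5.1 (Continuity of upper and lower temperatures on adiabats). The temperatures
$T_+$ and $T_-$ are locally Lipschitz continuous along each adiabat $\partial A_X$. I.e., for each $X \in \Gamma$
and each closed ball $B_{X,r} \subset \Gamma$ of radius $r$ and centered at $X$ there is a constant $c(X,r)$ such that

$$|T_+(X) - T_+(Y)| \le c(X,r)\,|X - Y|$$

for all $Y \in \partial A_X$ with $|X - Y| < r$. The same inequality holds for $T_-(X)$. Furthermore, $c(X,r)$ is
a continuous function of $X$ in any domain $D \subset \Gamma$ such that $B_{X,2r} \subset \Gamma$ for all $X \in D$.

Proof: Recall that the pressure $P(X)$ is assumed to be locally Lipschitz continuous and that
$\partial U/\partial V_i = -P_i$ on adiabats. Write $X = (U_0, V_0)$ and let the adiabatic surface through $X$ be denoted
by $(W_0(V), V)$ where $W_0(V)$ is the unique solution to the system of equations

$$\frac{\partial W_0(V)}{\partial V_i} = -P_i(W_0(V), V)$$

with $W_0(V_0) = U_0$. (Thus $W_0$ is the function $u_X$ of Theorem 3.5.) Similarly, for $\varepsilon > 0$ we let
$W_\varepsilon(V)$ be the solution to

$$\frac{\partial W_\varepsilon(V)}{\partial V_i} = -P_i(W_\varepsilon(V), V)$$

with $W_\varepsilon(V_0) = U_0 + \varepsilon$. Of course all this makes sense only if $|V - V_0|$ and $\varepsilon$ are sufficiently small
so that the points $(W_\varepsilon(V), V)$ lie in $\Gamma$. In this region (which we can take to be bounded) we let $C$
denote the Lipschitz constant for $P$, i.e. $|P(Z) - P(Z')| \le C|Z - Z'|$ for all $Z, Z'$ in the region.

Let $S_\varepsilon$ denote the entropy on $(W_\varepsilon(V), V)$; it is constant on this surface by assumption. By
definition

$$\frac{1}{T_+(U_0,V_0)} = \lim_{\varepsilon \downarrow 0} \frac{S_\varepsilon - S_0}{\varepsilon},$$

and

$$T_+(W_0(V), V) = \lim_{\varepsilon \downarrow 0} \frac{W_\varepsilon(V) - W_0(V)}{S_\varepsilon - S_0} = T_+(U_0,V_0)\left[\lim_{\varepsilon \downarrow 0} G_\varepsilon(V) + 1\right],$$

where $G_\varepsilon(V) := \frac{1}{\varepsilon}[W_\varepsilon(V) - W_0(V) - \varepsilon]$. The lemma will be proved if we can show that there is a
number $D$ and a radius $R > 0$ such that $G_\varepsilon(V) \le D|V - V_0|$ for all $|V - V_0| < R$.

Let $v$ be a unit vector in the direction of $V - V_0$ and set $V(t) = V_0 + tv$, so that $V(0) = V_0$,
$V(t) = V$ for $t = |V - V_0|$. Set $W_\varepsilon(t) := W_\varepsilon(V(t))$ and $\Pi(U,t) := -v \cdot P(U,V(t))$. Fix $T > 0$ so
that $CT \le \frac{1}{2}$ and so that the ball $B_{X,2T}$ with center $X$ and radius $2T$ satisfies $B_{X,2T} \subset \Gamma$. Then,
for $0 \le t \le T$ and $\varepsilon$ small enough

$$W_0(t) = U_0 + \int_0^t \Pi(W_0(t'), t')\,dt'$$

$$W_\varepsilon(t) - \varepsilon = U_0 + \int_0^t \Pi(W_\varepsilon(t') - \varepsilon + \varepsilon, t')\,dt'.$$

Define

$$g_\varepsilon = \sup_{0 \le t \le T} \frac{1}{\varepsilon}[W_\varepsilon(t) - \varepsilon - W_0(t)] = \sup_{0 \le t \le T} G_\varepsilon(V(t)).$$

By subtracting the equation for $W_0$ from that of $W_\varepsilon$, and using that $\Pi$ inherits the Lipschitz constant $C$ of $P$ because $|v| = 1$, we have that

$$|G_\varepsilon(V(t))| \le \int_0^t C[1 + g_\varepsilon]\,dt' \le tC[1 + g_\varepsilon].$$
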
By taking the supremum of the left side over $0 \le t \le T$ we obtain $g_\varepsilon \le TC[1 + g_\varepsilon]$, from which
we see that $g_\varepsilon \le 1$ (because $TC \le 1/2$). But then $|G_\varepsilon(V(t))| \le 2tC$ or, in other words, $|G_\varepsilon(V)| \le 2|V - V_0|C$
whenever $|V - V_0| \le T$, which was to be proved. ■

Before addressing our next goal, the equality of $T_+$ and $T_-$, let us note the maximum entropy
principle, Theorem 4.2, and its relation to $T_\pm$. The principle states that if $X_1 = (U_1,V_1)$ and
$X_2 = (U_2,V_2)$ are in $\Gamma$ then $X_1 \overset{T}{\sim} X_2$ if and only if the following is true:

$$S(X_1) + S(X_2) = \sup_W \{S(U_1+U_2-W, V_1) + S(W, V_2) : (U_1+U_2-W, V_1) \in \Gamma \text{ and } (W,V_2) \in \Gamma\}. \tag{5.2}$$

Since $S$ is concave, at every point $X \in \Gamma$ there is an upper temperature and lower temperature, as
given in (5.1). This gives us an 'interval-valued' function on $\Gamma$ which assigns to each $X$ the interval

$$T(X) = [T_-(X), T_+(X)].$$

If $S$ is differentiable at $X$ then $T_-(X) = T_+(X)$ and the closed interval $T(X)$ is then merely the
single number $\left(\frac{\partial S}{\partial U}\right)^{-1}(X)$. If $T_-(X) = T_+(X)$ we shall abuse the notation slightly by thinking of
$T(X)$ as a number, i.e., $T(X) = T_-(X) = T_+(X)$.

The significance of the interval $T(X)$ is that (5.2) is equivalent to:

$$X_1 \overset{T}{\sim} X_2 \text{ if and only if } T(X_1) \cap T(X_2) \ne \emptyset.$$

In other words, if $\partial S/\partial U$ makes a jump at $X$ then one should think of $X$ as having all the
temperatures in the closed interval $T(X)$.
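
The equivalence between the maximum entropy condition (5.2) and the intersection of temperature intervals can be tried out on a small example. The sketch below assumes a hypothetical one-variable entropy `s` with a kink at $U = 1$ (the work coordinate is suppressed) and checks (5.2) only on a finite grid of redistributions $W$, so it is a plausibility check rather than a proof.

```python
def s(U):
    """Hypothetical concave entropy with a kink at U = 1 (work coordinate suppressed)."""
    return min(U, 0.5 * U + 0.5)


def temperature_interval(U):
    """The interval [T_-(X), T_+(X)] for the entropy s above."""
    if U < 1.0:
        return (1.0, 1.0)
    if U > 1.0:
        return (2.0, 2.0)
    return (1.0, 2.0)


def intervals_intersect(a, b):
    """True if the closed intervals a and b have a common point."""
    return max(a[0], b[0]) <= min(a[1], b[1])


def satisfies_max_entropy(U1, U2, n=4000, tol=1e-9):
    """Check (5.2) on a grid: no redistribution W of the total energy beats s(U1) + s(U2)."""
    total = U1 + U2
    best = max(s(total - total * k / n) + s(total * k / n) for k in range(1, n))
    return s(U1) + s(U2) >= best - tol


for U1, U2 in [(0.5, 1.0), (1.0, 3.0), (0.5, 3.0)]:
    print(U1, U2, satisfies_max_entropy(U1, U2),
          intervals_intersect(temperature_interval(U1), temperature_interval(U2)))
```

For each pair the two tests agree: the state at the kink, which carries the whole interval $[1,2]$, is in equilibrium with states on either side, while states with disjoint intervals are not.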

In Theorem 5.1 we shall prove that the temperature is single-valued, i.e., $T_-(X) = T_+(X)$.
Thus, we have the following fact relating thermal equilibrium and temperature:

$$X_1 \overset{T}{\sim} X_2 \text{ if and only if } T(X_1) = T(X_2). \tag{5.3}$$

THEOREM 5.1 (Uniqueness of temperature). At every point $X$ in the state space of a
simple system, $\Gamma$, we have

$$T_+(X) = T_-(X),$$

i.e., $T(X)$ is the number $\left[\left(\frac{\partial S}{\partial U}\right)(X)\right]^{-1}$.

Proof: The proof will rely heavily on the zeroth law, on the continuity of $T_\pm$ on adiabats,
on transversality, on axiom T5 and on the maximum entropy principle for thermal equilibrium,
Theorem 4.2.

Assume that $Z \in \Gamma$ is a point for which $T_+(Z) > T_-(Z)$. We shall obtain a contradiction from
this.

Part 1: We claim that for every $Y \in \partial A_Z$, $T_+(Y) = T_+(Z)$ and $T_-(Y) = T_-(Z)$. To this end
define the (conceivably empty) set $K \subset \Gamma$ by $K = \{X \in \Gamma : T_+(X) = T_-(X) \in T(Z)\}$. If $X_1 \in K$
and $X_2 \in K$ then $T(X_1) = T(X_2) \in T(Z)$ by the zeroth law (since $X_1 \overset{T}{\sim} Z$ and $X_2 \overset{T}{\sim} Z$, and thus
$X_1 \overset{T}{\sim} X_2$). Therefore, there is a single number $T^* \in T(Z)$ such that $T(X) = T^*$ for all $X \in K$.

Now suppose that $Y \in \partial A_Z$ and that $T_+(Y) < T_+(Z)$. By the continuity of $T_+$ on $\partial A_Z$
(Lemma 5.1) there is then another point $W \in \partial A_Z$ such that $T_-(Z) < T_+(W) < T_+(Z)$, which
implies that $W \overset{T}{\sim} Z$. We write $W = (U_W, V_W)$ and consider $f_W(U) = S(U, V_W)$, which is a concave
function of one variable (namely $U$) defined on some open interval containing $U_W$. It is a general
fact about concave functions that the set of points at which $f_W$ is differentiable (i.e., $T_+ = T_-$) is
dense and that if $U_1 > U_2 > U_3 > \cdots > U_W$ is a decreasing sequence of such points converging
to $U_W$ then $T(U_i)$ converges to $T_+(U_W)$. We denote the corresponding points $(U_i, V_W)$ by $W_i$ and
note that, for large $i$, $T(W_i) \in T(Z)$. Therefore $T(W_i) = T^*$ for all large $i$ and hence $T_+(W) = T^*$.

Now use continuity again to find a point $R \in \partial A_Z$ such that $T^* = T_+(W) < T_+(R) < T_+(Z)$.

Again there is a sequence $R_i = (U^i, V_R)$ with $T_+(R_i) = T_-(R_i) = T(R_i)$ converging downward to
$R$ and such that $T(R_i) \to T_+(R) > T^*$. But for large $i$, $T(R_i) \in T(Z)$ so $T(R_i) = T^*$. This is a
contradiction, and we thus conclude that

$$T_+(Y) = T_+(Z)$$

for all $Y \in \partial A_Z$ when $T_+(Z) > T_-(Z)$.

Likewise $T_-(Y) = T_-(Z)$ under the same conditions.

Part 2: Now we study $\rho_Z \subset \mathbb{R}^n$, which is the projection of $\partial A_Z$ on $\mathbb{R}^n$. By Theorem 3.3, $\rho_Z$
is open and connected. It is necessary to consider two cases.

Case 1: $\rho_Z$ is the projection of $\Gamma$, i.e., $\rho_Z = \{V \in \mathbb{R}^n : (U,V) \in \Gamma \text{ for some } U \in \mathbb{R}\} = \rho(\Gamma)$.

In this case we use the transversality axiom T4, according to which there are points $X$ and $Y$ in
$\Gamma$ with $X \prec\prec Z \prec\prec Y$ (and hence $S(X) < S(Z) < S(Y)$), but with $X \overset{T}{\sim} Y$. We claim that
every $X$ with $S(X) < S(Z)$ has $T_+(X) \le T_-(Z)$. Likewise, we claim that $S(Y) > S(Z)$ implies
that $T_-(Y) \ge T_+(Z)$. These two facts will contradict the assumption that $T(Y) \cap T(X)$ is not
empty. To prove that $T_+(X) \le T_-(Z)$ we consider the line $(U, V_X) \cap \Gamma$. As $U$ increases from the
value $U_X$, the temperature $T_+(U,V_X)$ also cannot decrease (by the concavity of $S$). Furthermore,
$(U_X,V_X) \prec (U,V_X)$ if and only if $U \ge U_X$ by Theorem 3.4. Since $\rho_Z = \rho(\Gamma)$ there is (by
Theorem 3.4) some $U_0 > U_X$ such that $(U_0,V_X) \in \partial A_Z$. But $T_-(U_0,V_X) = T_-(Z)$ as we proved
above. However, $T_+(X) \le T_-(U_0, V_X)$ by (5.1). A similar proof shows that $T_-(Y) \ge T_+(Z)$ when
$S(Y) > S(Z)$.

Case 2: $\rho_Z \ne \rho(\Gamma)$. Here we use T5. Both $\rho_Z$ and $\rho(\Gamma)$ are open sets and $\rho_Z \subset \rho(\Gamma)$. Hence,
there is a point $V$ in $\bar\rho_Z$, the closure of $\rho_Z$, such that $V \in \rho(\Gamma)$ but $V \notin \rho_Z$. Let $l_V := L_V \cap \Gamma = \{(U,V) : U \in \mathbb{R}$
and $(U,V) \in \Gamma\}$. If $X \in l_V$ then either $Z \prec\prec X$ or $X \prec\prec Z$. (This is so because we are dealing
with a simple system, which implies that $X \succ Z$ or $X \prec Z$, but we cannot have $X \overset{A}{\sim} Z$ because
then $X \in \partial A_Z$, which is impossible since $l_V \cap \partial A_Z$ is empty.) Suppose, for example, that $Z \prec\prec X$
or, equivalently, $S(X) > S(Z)$. Then $S(Y) > S(Z)$ for all $Y \in l_V$ (by continuity of $S$, and by the
fact that $S(Y) \ne S(Z)$ on $l_V$).

Now $A_X$ has a tangent plane $\Pi_X$ at $X$, which implies that $\rho_X \cap \rho_Z$ is not empty. Thus there
is a point

$$W_1 = (U_1,V_1) \in \partial A_X \text{ with } V_1 \in \rho_X \cap \rho_Z \text{ and } S(W_1) = S(X) > S(Z).$$

By definition, there is a point $(U_0, V_1) \in \partial A_Z$ with $U_0 < U_1$. By concavity of $U \mapsto S(U, V_1)$ we have
that $T_-(W_1) \ge T_+(U_0, V_1) = T_+(Z)$. By continuity of $T_-$ along the adiabat $\partial A_X$ we conclude that
$T_-(X) \ge T_+(Z)$. The same conclusion holds for every $Y \in l_V$ and thus the range of temperature
on the line $l_V$ is an interval $(t_1,t_2)$ with $t_1 \ge T_+(Z)$.

By similar reasoning, if $R$ is in the set $\{(U,V) : V \in \rho_Z, S(U,V) < S(Z)\}$ then $T_+(R) \le T_-(Z)$.
Hence the temperature range on any line $l_{\hat V}$ with $\hat V \in \rho_Z$ has a lower end $\hat t_1$ satisfying $\hat t_1 \le T_-(Z)$. This contradicts
T5 since $T_-(Z) < T_+(Z)$. A similar proof works if $X \prec\prec Z$. ■

Having shown that the temperature is uniquely defined at each point of $\Gamma$ we are now in a
position to establish our goal.

THEOREM 5.2 (Continuity of temperature). The temperature $T(X) = T_+(X) = T_-(X)$
is a continuous function on the state space, $\Gamma \subset \mathbb{R}^{n+1}$, of a simple system.

Proof: Let $X_\infty, X_1, X_2, \ldots$ be points in $\Gamma$ such that $X_j \to X_\infty$ as $j \to \infty$. We write $X_j =
(U_j,V_j)$, we let $A_j$ denote the adiabat $\partial A_{X_j}$, we let $T_j = T(X_j)$ and we set $l_j = \{(U,V_j) : (U,V_j) \in
\Gamma\}$. We know that $T$ is continuous and monotone along each $l_j$ because $T_+ = T_-$ everywhere by
Theorem 5.1. We also know that $T$ is continuous on each $A_j$ by Lemma 5.1. In fact, if we assume
that all the $X_j$'s are in some sufficiently small ball, $B$, centered at $X_\infty$, then by Lemma 5.1 we can
also assume that for some $c < \infty$

$$|T(X) - T(Y)| \le c\,|X - Y|$$

whenever $X$ and $Y$ are in $B$ and $X$ and $Y$ are on the same adiabat, $A_j$. Lemma 5.1 also states
that $c$ can be taken to be independent of $X$ and $Y$ in the ball $B$.

By assumption, the slope of the tangent plane $\Pi_X$ is locally Lipschitz continuous, i.e., the
pressure $P(X)$ is locally Lipschitz continuous. Therefore (again, assuming that $B$ is taken small
enough) we can assume that each adiabat $A_j$ intersects $l_\infty$ in some point, which we denote by $Y_j$.
Since $|X_j - X_\infty| \to 0$ as $j \to \infty$, we have that $Y_j \to X_\infty$ as well. Thus,

$$|T(X_j) - T(X_\infty)| \le |T(X_j) - T(Y_j)| + |T(Y_j) - T(X_\infty)|.$$

As $j \to \infty$, $T(Y_j) - T(X_\infty) \to 0$ because $Y_j$ and $X_\infty$ are in $l_\infty$. Also, $T(X_j) - T(Y_j) \to 0$ because
$|T(X_j) - T(Y_j)| \le c|X_j - Y_j| \le c|X_j - X_\infty| + c|Y_j - X_\infty|$. ■

THEOREM 5.3 (Differentiability of S). The entropy, $S$, is a continuously differentiable
function on the state space $\Gamma$ of a simple system.

Proof: The adiabat through a point $X \in \Gamma$ is characterized by the once continuously differentiable
function, $u_X(V)$, on $\mathbb{R}^n$. Thus, $S(u_X(V), V)$ is constant, so (in the sense of distributions)

$$0 = \left(\frac{\partial S}{\partial U}\right)\left(\frac{\partial u_X}{\partial V_j}\right) + \frac{\partial S}{\partial V_j}.$$

Since $1/T = \partial S/\partial U$ is continuous, and $\partial u_X/\partial V_j = -P_j$ is Lipschitz continuous, we see that $\partial S/\partial V_j$
is a continuous function and we have the well known formula

$$\frac{\partial S}{\partial V_j} = \frac{P_j}{T}. \qquad ■$$
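
The formula $\partial S/\partial V_j = P_j/T$ can be checked numerically for a familiar model. The sketch below assumes the ideal-gas-like entropy $S(U,V) = \frac{3}{2}\ln U + \ln V$ (units chosen so that all constants are 1); this entropy, the sample point and the helper names are assumptions made for illustration only.

```python
import math


def entropy(U, V):
    """Assumed model entropy S(U, V) = 1.5*ln(U) + ln(V)."""
    return 1.5 * math.log(U) + math.log(V)


def adiabat_energy(V, U0, V0):
    """Energy u_X(V) on the adiabat through X = (U0, V0), where U**1.5 * V is constant."""
    return U0 * (V0 / V) ** (2.0 / 3.0)


U0, V0, h = 2.0, 3.0, 1e-6

# Temperature from 1/T = dS/dU (central difference).
T = 2.0 * h / (entropy(U0 + h, V0) - entropy(U0 - h, V0))

# Pressure from the slope of the adiabat: du_X/dV = -P.
P = -(adiabat_energy(V0 + h, U0, V0) - adiabat_energy(V0 - h, U0, V0)) / (2.0 * h)

# Direct derivative of S with respect to V.
dS_dV = (entropy(U0, V0 + h) - entropy(U0, V0 - h)) / (2.0 * h)

print(dS_dV, P / T)  # the two numbers agree up to discretization error
```

Here the pressure is read off from the adiabat alone, and the temperature from $\partial S/\partial U$ alone, so the agreement of the two printed numbers is a genuine check of the formula in this model.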



We are now in a position to give a simple proof of the most important property of temperature, 
namely its role in determining the direction of energy transfer, and hence, ultimately, the linear 
ordering of systems with respect to heat transfer (even though we have not defined 'heat' and have 
no intention of doing so). The fact that energy only flows 'downhill' without the intervention of 
extra machinery was taken by Clausius as the foundation of the second law of thermodynamics, as 
we said in Section I. 

THEOREM 5.4 (Energy flows from hot to cold). Let $(U_1,V_1)$ be a point in a state
space $\Gamma_1$ of a simple system and let $(U_2,V_2)$ be a point in a state space $\Gamma_2$ of another simple system.
Let $T_1$ and $T_2$ be their respective temperatures and assume that $T_1 > T_2$. If $(U_1',V_1)$ and $(U_2',V_2)$
are two points with the same respective work coordinates as the original points, with the same total
energy $U_1 + U_2 = U_1' + U_2'$, and for which the temperatures are equal to a common value, $T$ (the
existence of such points is guaranteed by axioms T1 and T2), then

$$U_1' < U_1 \quad\text{and}\quad U_2' > U_2.$$



Proof: By assumption $T_1 > T_2$ and we claim that

$$T_1 \ge T \ge T_2. \tag{5.4}$$

(At least one of these inequalities is strict because of the uniqueness of temperature for each state.)
Suppose that inequality (5.4) failed, e.g., $T > T_1 > T_2$. Then we would have that $U_1' \ge U_1$ and
$U_2' \ge U_2$ and at least one of these would be strict (by the strict monotonicity of $U$ with respect to $T$,
which follows from the concavity and differentiability of $S$). This pair of inequalities is impossible
in view of the condition $U_1 + U_2 = U_1' + U_2'$.

Since $T$ satisfies (5.4), the theorem now follows from the monotonicity of $U$ with respect to $T$.

■
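
Theorem 5.4 can be illustrated with two model systems whose work coordinates are held fixed. The sketch assumes entropies of the form $S_i(U) = C_i \ln U$, so that $T_i = U_i/C_i$; the constants and the function names are hypothetical.

```python
def temperature(U, C):
    """Temperature of a model system with entropy S(U) = C*ln(U) at fixed work coordinates."""
    return U / C


def equalize(U1, C1, U2, C2):
    """Redistribute the total energy so that both systems reach a common temperature."""
    T = (U1 + U2) / (C1 + C2)
    return C1 * T, C2 * T, T


U1, C1 = 6.0, 1.0   # hot system:  T1 = 6
U2, C2 = 2.0, 2.0   # cold system: T2 = 1
U1_new, U2_new, T = equalize(U1, C1, U2, C2)

# The hot system loses energy, the cold one gains it, and T lies between T2 and T1.
print(U1_new < U1, U2_new > U2, temperature(U1, C1) >= T >= temperature(U2, C2))
```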

From the entropy principle and the relation

$$T = (\partial S/\partial U)^{-1}$$

between temperature and entropy we can now derive the usual formula for the Carnot efficiency

$$\eta_C := 1 - (T_0/T_1) \tag{5.5}$$

as an upper bound for the efficiency of a 'heat engine' that undergoes a cyclic process. Let us define
a thermal reservoir to be a simple system whose work coordinates remain unchanged during
some process (or which has no work coordinates, i.e. is a degenerate simple system). Consider a
combined system consisting of a thermal reservoir and some machine, and an adiabatic process for
this combined system. The entropy principle says that the total entropy change in this process is

$$\Delta S_{\text{machine}} + \Delta S_{\text{reservoir}} \ge 0. \tag{5.6}$$

Let $-Q$ be the energy change of the reservoir, i.e., if $Q \ge 0$, then the reservoir delivers energy,
otherwise it absorbs energy. If $T$ denotes the temperature of the reservoir at the beginning of the process,
then, by the concavity of $S_{\text{reservoir}}$ in $U$ (the graph of a concave function lies below its tangent at the
initial energy), we have

$$\Delta S_{\text{reservoir}} \le -\frac{Q}{T}. \tag{5.7}$$

Hence

$$\Delta S_{\text{machine}} - \frac{Q}{T} \ge 0. \tag{5.8}$$

Let us now couple the machine first to a 'high temperature reservoir' which delivers energy $Q_1$
and has initial temperature $T_1$, and later to a 'low temperature reservoir' which absorbs
energy $-Q_0$ and has initial temperature $T_0$. The whole process is assumed to be cyclic for
the machine so the entropy changes for the machine in both steps cancel. (It returns to its initial
state.) Combining (5.6), (5.7) and (5.8) we obtain

$$Q_1/T_1 + Q_0/T_0 \le 0 \tag{5.9}$$

which gives the usual inequality for the efficiency $\eta := (Q_1 + Q_0)/Q_1$:

$$\eta \le 1 - (T_0/T_1) = \eta_C. \tag{5.10}$$

In textbook presentations it is usually assumed that the reservoirs are infinitely large, so that their
temperature remains unchanged, but formula (5.10) remains valid for finite reservoirs, provided $T_1$
and $T_0$ are properly interpreted, as above.
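
For finite reservoirs the bound (5.10) can be tested on a simple model. The sketch below assumes reservoirs with entropy $S(U) = C \ln U$ (so $T = U/C$) and takes $T_1$ and $T_0$ to be the temperatures of the reservoirs before the exchange, which is the reading for which the concavity estimate $\Delta S_{\text{reservoir}} \le -Q/T$ holds. The reservoir model, the numbers and the function names are illustrative assumptions.

```python
import math


def reservoir_entropy_change(C, T_start, Q):
    """Entropy change of a model reservoir with S(U) = C*ln(U), hence T = U/C,
    that starts at temperature T_start and delivers the energy Q (Q < 0: absorbs)."""
    U_start = C * T_start
    return C * math.log((U_start - Q) / U_start)


def best_efficiency(C, T1, T0, Q1):
    """Efficiency (Q1 + Q0)/Q1 of a cyclic machine in the optimal case where the
    entropy changes of the two reservoirs add up to zero."""
    dS_hot = reservoir_entropy_change(C, T1, Q1)
    # Solve C*ln(1 - Q0/(C*T0)) = -dS_hot for Q0 (negative: the cold reservoir absorbs).
    Q0 = C * T0 * (1.0 - math.exp(-dS_hot / C))
    return (Q1 + Q0) / Q1


C, T1, T0, Q1 = 10.0, 400.0, 300.0, 500.0
eta = best_efficiency(C, T1, T0, Q1)
print(eta, 1.0 - T0 / T1)  # the first number does not exceed the second
```

Because the hot reservoir cools and the cold one warms during the exchange, even the optimal machine in this model stays strictly below $\eta_C$.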

B. Geometry of isotherms and adiabats 

Each adiabat in a simple system is the boundary of a convex set and hence has a simple
geometric shape, like a 'bowl'. It must be an object of dimension $n$ when the state space in question
is a subset of $\mathbb{R}^{n+1}$. In contrast, an isotherm, i.e., the set on which the temperature assumes a
given value $T$, can be more complicated. When $n = 1$ (with energy and volume as coordinates)
and when the system has a triple point, a portion of an isotherm (namely the isotherm through the
triple point) can be two-dimensional. See Figure 8 where this isotherm is described graphically.

Insert Figure 8 here 

One can ask whether isotherms can have other peculiar properties. Axiom T4 and Theorem 4.5 
already told us that an isotherm cannot coincide completely with an adiabat (although they could 
coincide over some region). If this were to happen then, in effect, our state space would be cut into 
two non-communicating pieces, and we have ruled out this pathology by fiat. However, another 
possible pathology would be that an isotherm consists of several disconnected pieces, in which case 
we could not pass from one side of an adiabat to another except by changing the temperature. 
Were this to happen then the pictures in the textbooks would really be suspect, but fortunately, 
this perversity does not occur, as we prove next. 

There is one technical point that must first be noted. By concavity and differentiability of the
entropy, the range of the temperature function over $\Gamma$ is always an interval. There are no gaps.
But the range need not go from 0 to $\infty$, in principle. (Since we defined the state spaces of simple
systems to be open sets, the point 0 can never belong to the range.) Physical systems ideally always
cover the entire range $(0, \infty)$, but there is no harm, and perhaps even a whiff of physical reality, in
supposing that the temperature range of the world is bounded. Recall that in axiom T5 we said
that the range must be the same for all systems and, indeed, for each choice of work coordinate
within a simple system. Thus, for an arbitrary simple system, $\Gamma$, and $V \in \rho(\Gamma)$

$$T_{\min} := \inf\{T(X) : X \in \Gamma\} = \inf\{T(U,V) : U \in \mathbb{R} \text{ such that } (U,V) \in \Gamma\}$$

and

$$T_{\max} := \sup\{T(X) : X \in \Gamma\} = \sup\{T(U,V) : U \in \mathbb{R} \text{ such that } (U,V) \in \Gamma\}.$$

THEOREM 5.5 (Isotherms cut adiabats). Suppose $X_0 \prec X \prec X_1$ and $X_0$ and $X_1$ have
equal temperatures, $T(X_0) = T(X_1) = T_0$.

(1). If $T_{\min} < T_0 < T_{\max}$ then there is a point $X' \overset{A}{\sim} X$ with $T(X') = T_0$. In other words: The
isotherm through $X_0$ cuts every adiabat between $X_0$ and $X_1$.

(2). If $T_0 = T_{\max}$, then either there is an $X' \overset{A}{\sim} X$ with $T(X') = T_0$, or, for any $T_0' < T_0$ there
exist points $X_0'$, $X'$ and $X_1'$ with $X_0' \prec X' \overset{A}{\sim} X \prec X_1'$ and $T(X_0') = T(X') = T(X_1') = T_0'$.

(3). If $T_0 = T_{\min}$, then either there is an $X' \overset{A}{\sim} X$ with $T(X') = T_0$, or, for any $T_0' > T_0$ there
exist points $X_0'$, $X'$ and $X_1'$ with $X_0' \prec X' \overset{A}{\sim} X \prec X_1'$ and $T(X_0') = T(X') = T(X_1') = T_0'$.

Proof: Step 1. First we show that for every $T_0$ with $T_{\min} < T_0 < T_{\max}$ the sets $\Omega_> := \{Y :
T(Y) > T_0\}$ and $\Omega_< := \{Y : T(Y) < T_0\}$ are open and connected. The openness follows from
the continuity of $T$. Suppose that $\Omega_1$ and $\Omega_2$ are non empty, open sets satisfying $\Omega_> = \Omega_1 \cup \Omega_2$.
We shall show that $\Omega_1 \cap \Omega_2$ is not empty, thereby showing that $\Omega_>$ is connected. By axiom T5,
the range of $T$, restricted to points $(U,V) \in \Gamma$, with $V$ fixed, is independent of $V$, and hence
$\rho(\Omega_>) = \rho(\Gamma)$, where $\rho$ denotes the projection $(U,V) \mapsto V$. It follows that $\rho(\Omega_1) \cup \rho(\Omega_2) = \rho(\Gamma)$
and, since $\rho$ is an open mapping and $\rho(\Gamma)$ is connected, we have that $\rho(\Omega_1) \cap \rho(\Omega_2)$ is not empty.
Now if $(U_1,V) \in \Omega_1 \subset \Omega_>$ and if $(U_2,V) \in \Omega_2 \subset \Omega_>$, then, by the monotonicity of $T(U,V)$ in $U$ for
fixed $V$, it follows that the line joining $(U_1,V) \in \Omega_1$ and $(U_2,V) \in \Omega_2$ lies entirely in $\Omega_> = \Omega_1 \cup \Omega_2$.
Since $\Omega_1$ and $\Omega_2$ are open, $\Omega_1 \cap \Omega_2$ is not empty and $\Omega_>$ is connected. Similarly, $\Omega_<$ is connected.

Step 2. We show that if $T_{\min} < T_0 < T_{\max}$, then there exist points $X_>$, $X_<$, with $X_> \overset{A}{\sim} X \overset{A}{\sim} X_<$
and $T(X_<) \le T_0 \le T(X_>)$. We write the proof for $X_>$; the existence of $X_<$ is shown in the same
way. In the case that $V_{X_0} \in \rho(A_X)$ the existence of $X_>$ follows immediately from the monotonicity
of $T(U,V)$ in $U$ for fixed $V$. If $V_{X_0} \notin \rho(A_X)$ we first remark that by axiom T5 and because
$T_0 < T_{\max}$ there exists $X_0' \prec X$ with $T_0 < T(X_0')$. Also, by monotonicity of $T$ in $U$ there exists $X_1'$
with $X \prec X_1 \prec X_1'$ and $T(X_1') > T_0$. Hence $X_0'$ and $X_1'$ both belong to $\Omega_>$, and $X_0' \prec X \prec X_1'$.
Now $\Omega_>$ is nonempty, open and connected, and $\partial A_X$ splits $\Gamma \setminus \partial A_X$ into disjoint, open sets. Hence
$\Omega_>$ must cut $\partial A_X$, i.e., there exists an $X_> \in \Omega_> \cap \partial A_X$.

Having established the existence of $X_>$ and $X_<$ we now appeal to continuity of $T$ and connectedness
of $\partial A_X$ (axiom S4) to conclude that there is an $X' \in \partial A_X$ with $T(X') = T_0$. This
completes the proof of assertion (1).

Step 3. If $T_0 = T_{\max}$ and $V_{X_0} \in \rho(A_X)$, then the existence of $X' \in \partial A_X$ with $T(X') = T_0$
follows from monotonicity of $T$ in $U$. Let us now assume that all points on $\partial A_X$ have temperatures
strictly less than $T_{\max}$. By axiom A5 and by continuity and monotonicity of $T$ in $U$, there is
for every $T_0' < T_0$ an $X_0' \prec X_0$ with $T(X_0') = T_0'$. For the same reasons there is an $X_1'$ with
$X \prec X_1' \prec X_1$ and $T(X_1') = T_0'$. By the argument of step 2 there is thus an $X' \in \partial A_X$ with
$T(X') = T_0'$. Thus assertion (2) is established. The case $T_0 = T_{\min}$ (assertion (3)) is treated
analogously. ■
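
Assertion (1) of Theorem 5.5 rests on an intermediate value argument along the adiabat, which can be mimicked by bisection. The sketch assumes the ideal-gas-like entropy $S(U,V) = \frac{3}{2}\ln U + \ln V$, for which $T = 2U/3$ and the adiabats are the curves $U^{3/2}V = \text{const}$; the model, the sample states and the function names are illustrative assumptions.

```python
import math


def entropy(U, V):
    """Assumed model entropy S(U, V) = 1.5*ln(U) + ln(V)."""
    return 1.5 * math.log(U) + math.log(V)


def temperature(U, V):
    """T = (dS/dU)**-1 = 2U/3 for the model entropy above."""
    return 2.0 * U / 3.0


def adiabat_energy(V, X):
    """Energy on the adiabat through X = (U0, V0) at volume V."""
    U0, V0 = X
    return U0 * (V0 / V) ** (2.0 / 3.0)


def cut_adiabat(X, T0, V_low, V_high, steps=100):
    """Bisect in V along the adiabat through X for a point with temperature T0.
    Requires T >= T0 at V_low and T <= T0 at V_high."""
    def excess(V):
        return temperature(adiabat_energy(V, X), V) - T0
    if not (excess(V_low) >= 0.0 >= excess(V_high)):
        raise ValueError("T0 is not bracketed on this piece of the adiabat")
    for _ in range(steps):
        V_mid = 0.5 * (V_low + V_high)
        if excess(V_mid) > 0.0:
            V_low = V_mid
        else:
            V_high = V_mid
    V = 0.5 * (V_low + V_high)
    return adiabat_energy(V, X), V


# X0 = (1.5, 1) and X1 = (1.5, 5) both have T = 1, and S(X0) < S(X) < S(X1).
X = (3.0, 1.0)
U_cut, V_cut = cut_adiabat(X, T0=1.0, V_low=1.0, V_high=10.0)
print(U_cut, V_cut, temperature(U_cut, V_cut), entropy(U_cut, V_cut) - entropy(*X))
```

The returned state lies on the adiabat of $X$ and on the isotherm through $X_0$ and $X_1$, which is the point $X'$ of the theorem in this model.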

C. Thermal equilibrium and uniqueness of entropy 

In Section II we have encountered two general uniqueness theorems for entropy. The first,
Theorem 2.4, relies only on axioms A1-A6, and CH for the double scaled copies of $\Gamma$, and states
that an entropy function on $\Gamma$ is uniquely determined, up to an affine transformation of scale, by
the relation $\prec$ on the double scaled copies. In the second, Theorem 2.10, it is further assumed that
the range of the entropy is connected which, in particular, is the case if the convex combination
axiom A7 holds. Under this condition the relation $\prec$ on $\Gamma \times \Gamma$ determines the entropy. Both these
uniqueness results are of a very general nature and rely only on the structure introduced in Section
II. The properties of entropy and temperature that we have now established on the basis of axioms
A1-A7, S1-S3 and T1-T5, allow us to supplement these results now with a uniqueness theorem of
a different kind.

THEOREM 5.6 (Adiabats and isotherms in $\Gamma$ determine the entropy). Let $\prec$ and
$\prec^*$ be two relations on the multiple scaled copies of a simple system $\Gamma$ satisfying axioms A1-A7,
S1-S3 and T1-T5. Let $\overset{T}{\sim}$ and $\overset{T}{\sim}{}^*$ denote the corresponding relations of thermal equilibrium between
states in $\Gamma$. If $\prec$ and $\prec^*$ coincide on $\Gamma$ and the same holds for the relations $\overset{T}{\sim}$ and $\overset{T}{\sim}{}^*$, then $\prec$ and
$\prec^*$ coincide everywhere. In other words: The adiabats in $\Gamma$ together with the isotherms determine
the relation $\prec$ on all multiple scaled copies of $\Gamma$ and hence the entropy is uniquely determined up
to an affine transformation of scale.

Proof: Let $S$ and $S^*$ be (concave and continuously differentiable) entropies characterizing
respectively the relations $\prec$ and $\prec^*$. (The existence follows from axioms A1-A7, S1-S3, and T1-T4,
as shown in the previous sections.) For points $X, Y \in \Gamma$ we have $S(X) = S(Y)$ if and only if $X \overset{A}{\sim} Y$,
which holds if and only if $S^*(X) = S^*(Y)$, because $\prec$ and $\prec^*$ coincide on $\Gamma$ by assumption. Hence
$S$ and $S^*$ have the same level sets, namely the adiabats of the simple system. Thus, we can write

$$S^*(X) = f(S(X))$$

for some strictly monotone function, $f$, defined on the range of $S$, which is some interval $I \subset \mathbb{R}$.
We claim that $f$ is differentiable on $I$ and therefore

$$\frac{\partial S^*}{\partial U}(X) = f'(S(X))\,\frac{\partial S}{\partial U}(X). \tag{5.11}$$

To prove the differentiability note that $\partial S/\partial U$ is never zero (since $S$ is strictly monotonic in $U$
by Planck's principle, Theorem 3.4). This implies that for each fixed $V$ in $\rho(\Gamma)$ the function
$U \mapsto S(U,V)$ has a continuous inverse $K(S,V)$. (This, in turn, implies that $I$ is open.) Thus, if
$X = (U,V)$ and $S(U,V) = \sigma$ and if $\sigma_1, \sigma_2, \ldots$ is any sequence of numbers converging to $\sigma$, the
sequence of numbers $U_j := K(\sigma_j, V)$ converges to $U$. Hence

$$\frac{f(\sigma_j) - f(\sigma)}{\sigma_j - \sigma} = \frac{S^*(U_j,V) - S^*(U,V)}{U_j - U} \bigg/ \frac{S(U_j,V) - S(U,V)}{U_j - U},$$

from which we deduce the differentiability of $f$ and the formula (5.11).
Now consider the function

$$G(X) = \left(\frac{\partial S^*}{\partial U}\right) \bigg/ \left(\frac{\partial S}{\partial U}\right),$$

which is continuous because $S$ and $S^*$ are continuously differentiable and $\frac{\partial S}{\partial U} \ne 0$. By Eq. (5.11),
with $g = f'$,

$$G(X) = g(S(X)),$$

and we now wish to prove that $g : I \to \mathbb{R}$ is a constant function (call it $a$). This will prove our
theorem because it implies that

$$S^*(U,V) = a\,S(U,V) + B(V).$$

This, in turn, implies that $B(V)$ is constant on adiabats. However, the projection of an adiabat,
$\partial A_X$, on $\mathbb{R}^n$ is an open set (because the pressure, which defines the tangent planes, is finite
everywhere). Thus, the projection $\rho(\Gamma)$ is covered by open sets on each of which $B(V)$ is constant.
But $\rho(\Gamma)$ is connected (indeed, it is convex) and therefore $B(V)$ is constant on all of $\rho(\Gamma)$.

To show that $g$ is constant, it suffices to show this locally. We know that $X \mapsto G(X) = g(S(X))$
is constant on adiabats, and it is also constant on isotherms because the level sets of $\partial S/\partial U$ and
$\partial S^*/\partial U$ both coincide with the isotherms. We now invoke the transversality property and Theorem
5.5. Let $\hat\sigma$ be any fixed point in the range $I$ of $S$, i.e., $\hat\sigma = S(\hat X)$ for some $\hat X \in \Gamma$. By the
transversality property there are points $X_0$, $X_1$ such that

$$\sigma_0 = S(X_0) < \hat\sigma < S(X_1) = \sigma_1$$

and $X_0 \overset{T}{\sim} X_1$. Now let $\sigma = S(X)$ be any other point in the open interval $(\sigma_0, \sigma_1)$. By Theorem 5.5
there are points $\hat X' \overset{A}{\sim} \hat X$ and $X' \overset{A}{\sim} X$ such that $\hat X'$ and $X'$ both lie on the same isotherm (namely
the isotherm through $X_0$ and $X_1$). But this means that $g(\sigma) = G(X') = G(\hat X') = g(\hat\sigma)$, so $g$
is constant. ■
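
The role of the isotherms in this argument can be seen in a numerical experiment. The sketch assumes the model entropy $S(U,V) = \frac{3}{2}\ln U + \ln V$, whose isotherms are the lines $U = \text{const}$, and compares an affine reparametrization $f$ with a nonlinear one; the two functions `affine` and `cubic` are hypothetical examples. Any strictly increasing $f$ keeps the adiabats, but only an affine $f$ keeps $G = f'(S)$ constant along an isotherm.

```python
import math


def S(U, V):
    """Assumed model entropy; its isotherms are the lines U = const."""
    return 1.5 * math.log(U) + math.log(V)


def d_dU(F, U, V, h=1e-6):
    """Central difference approximation of the partial derivative of F in U."""
    return (F(U + h, V) - F(U - h, V)) / (2.0 * h)


def ratio_G(f, U, V):
    """G(X) = (dS*/dU)/(dS/dU) for the candidate S* = f(S)."""
    def S_star(u, v):
        return f(S(u, v))
    return d_dU(S_star, U, V) / d_dU(S, U, V)


def affine(s):
    return 2.0 * s + 5.0


def cubic(s):
    return s ** 3 + s  # strictly increasing, but not affine


# Two states on the same isotherm U = 2 that lie on different adiabats.
X_a, X_b = (2.0, 1.0), (2.0, 3.0)
print(ratio_G(affine, *X_a), ratio_G(affine, *X_b))  # equal
print(ratio_G(cubic, *X_a), ratio_G(cubic, *X_b))    # different
```

For the nonlinear choice the level sets of $\partial S^*/\partial U$ no longer coincide with the isotherms of $S$, so such an $S^*$ would describe a different relation of thermal equilibrium.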

Remark: The transversality property is essential for this uniqueness theorem. As a counterexample,
suppose that every isotherm is an adiabat. Then any concave $S$ that has the adiabats as
its level sets would be an acceptable entropy.

VI. MIXING AND CHEMICAL REACTIONS



A. The difficulty of fixing entropy constants 

We have seen in Sections II and IV that the entropies of all simple systems can be calibrated 
once and for all so that the entropy of any compound system made up of any combination of the 
basic simple systems is exactly the sum of the individual entropies. This global entropy works (i.e., 
it satisfies the entropy principle of Sect. II B and tells us exactly which processes can occur) in 
those cases in which the 'masses' of the individual systems are conserved. That is, splitting and 
recombination of simple systems is allowed, but not mixing of different systems or (chemical or 
nuclear) reactions. 

Nature does allow us to mix the contents of different simple systems, however, (which is not 
to be confused with the formation of a compound system). Thus, we can mix one mole of water 
and one mole of alcohol to form two moles of whiskey. The entropy of the mixture is certainly 
not the sum of the individual entropies, as would be the case if we were forming a compound 
system. Nevertheless, our previous analysis, namely Theorem 2.5, does tell us the entropy of the
mixture, up to an additive constant! The multiplicative constant can be, and will be henceforth,
fixed by the entropy function of one standard system, e.g., one mole of mercury. The reason that
the multiplicative constant is fixed for the mixture is, as we have stressed, the notion of thermal 
equilibrium. Another way to say this is that once the unit of energy (say Joules) and of temperature
(say Kelvin) have been fixed, then the entropy of every system, simple and compound, is fixed up
to an additive constant. Our assumptions A1-A7, S1-S3 and T1-T5 guarantee this.

A similar discussion applies to chemical reaction products. After all, the solution of alcohol in 
water can be considered a chemical reaction if one wishes. It requires a certain amount of chemical 
sophistication, which was not available before the enlightenment, to distinguish a mixture from a 
chemical compound. 

The question addressed in this section is this: to what extent can the additive constants 
(denoted by the letter B, in conformity with Theorems 2.3 and 2.5) be determined so that whenever 
a mixture or reaction occurs adiabatically we can say that the entropy has not decreased? To what 
extent is this determination unique? 

One thing that conceivably might have to be discarded, partially at least, is the idea that
comparability is an equivalence relation. As stated in Section I, to have an equivalence relation
would require that whenever $X \prec Z$ and $Y \prec Z$ then $X \prec Y$ or $Y \prec X$ (and similarly for $Z \prec X$
and $Z \prec Y$). If one were to resort to the standard devices of semi-permeable membranes and van
't Hoff boxes, as in the usual textbooks, then it would be possible to maintain this hypothesis,
even for mixing and chemical reactions. In that case, one would be able to prove that the additive
entropy constants are uniquely determined for all matter, once they have been chosen for the 92
chemical elements.

Alas, van 't Hoff boxes do not exist in nature, except in imperfect form. For example, Fermi
(1956, p. 101), in a discussion of the van 't Hoff box, writes that "The equilibria of gaseous reactions
can be treated thermodynamically by assuming the existence of ideal semi-permeable membranes",
but then goes on to state that "We should notice, finally, that in reality no ideal semi-permeable
membranes exist. The best approximation of such a membrane is a hot palladium foil, which
behaves like a semi-permeable membrane for hydrogen." Nevertheless, the rest of Fermi's discussion
is based on the existence of such membranes!

We are not saying that the comparison hypothesis must be discarded for chemical reactions
and mixtures; we are only raising the logical possibility. As a result, we shall try to organize our
discussion without using this hypothesis.

Therefore, we shall have to allow the possibility that if a certain kind of process is theoretically 
possible then entropy increase alone does not determine whether it will actually occur; in particular 
cases it might conceivably be necessary to have a certain minimum amount of entropy increase before 
a reaction can take place. Moreover, the entropy principle of Section II. B conceivably might not 
hold in full generality in the sense that there could be irreversible processes for which entropy does 
not strictly increase. What we do show in this section is that it is possible, nevertheless, to fix the 
entropy constants of all substances in such a way, that the entropy never decreases in an adiabatic 
process. This weak form of the entropy principle is stated in Theorem 6.2. However, it is only
because of a technicality concerned with uncountably many dimensions that we cannot prove the
entropy principle in the strong form and there is no doubt that the 'good case' mentioned at the
end of this section actually holds in the real world. For all practical purposes we do have the strong 
form because the construction of the constants is done inductively in such a way that at each stage 
it is not necessary to revise the constants previously obtained; this means that in the finite world 
in which we live we are actually dealing, at any given moment, with the countable case. 

A significant point to notice about the additive constants, B, is that they must scale correctly 
when the system scales; a somewhat subtler point is that they must also obey the additivity law 
under composition of two or more systems, $\Gamma_1 \times \Gamma_2$, in order that (2.4) holds. As we shall see in
Sect. B, this latter requirement will not be met automatically and it will take a bit of effort to 
achieve it. 

As a final introductory remark let us mention a computational device that is often used, and 
which seems to eliminate the need for any special discussion about mixing, reactions or other 
variations in the amount of matter. This device is simply to regard the amount of a substance 
(often called the 'particle number' because of our statistical mechanical heritage) as just one more 
work coordinate. The corresponding 'pressure' is called the chemical potential in this case. Why 
does this not solve our problems? The answer, equally simply, is that the comparison hypothesis 
will not hold within a state space since the extended state space will 'foliate' into sheets, in each of 
which the particle number is fixed. Axiom S2 will fail to hold. If particle number is introduced as a 
work coordinate then the price we will have to pay is that there will be no simple systems. Nothing 
will have been gained. The question we address here is a true physical question and cannot be 
eliminated by introducing a mathematical definition. 

B. Determination of additive entropy constants 

Let us consider a collection of systems (more precisely, state spaces), containing simple and/or
compound systems. Certain adiabatic state changes are possible, and we shall be mainly interested
in those that take us from one specified system to another, e.g., $X \prec Y$ with $X \in \Gamma$ and $Y \in \Gamma'$.
Although there are uncountably many systems (since, in our convention, changing the amount of
any component means changing the system), we shall always deal in the following with processes
involving only finitely many systems at one time. In our notation the process of making one mole
of water from hydrogen and oxygen is carried out by letting $X$ be a state in the compound system
$\Gamma$ consisting of one mole of H$_2$ and one half mole of O$_2$ and by taking $Y$ to be a state in the simple
system, $\Gamma'$, consisting of one mole of water.

Each system has a well defined entropy function, e.g., for $\Gamma$ there is $S_\Gamma$, and we know from
Section IV that these can be determined in such a way that the sum of the entropies increases in
any adiabatic process in any compound space $\Gamma_1 \times \Gamma_2 \times \cdots$. Thus, if $X_i \in \Gamma_i$ and $Y_i \in \Gamma_i$ then

$$(X_1, X_2, \dots) \prec (Y_1, Y_2, \dots) \quad \text{if and only if} \quad S_1(X_1) + S_2(X_2) + \cdots \le S_1(Y_1) + S_2(Y_2) + \cdots \tag{6.1}$$

where we have denoted $S_{\Gamma_i}$ by $S_i$ for short. The additive entropy constants do not matter here
since each function $S_i$ appears on both sides of this inequality.
Now we consider relations of the type

$$X \prec Y \quad \text{with} \quad X \in \Gamma,\ Y \in \Gamma'. \tag{6.2}$$

Our goal is to find constants $B(\Gamma)$, one for each state space $\Gamma$, in such a way that the entropy
defined by

$$S(X) := S_\Gamma(X) + B(\Gamma) \quad \text{for } X \in \Gamma \tag{6.3}$$

satisfies

$$S(X) \le S(Y) \tag{6.4}$$

whenever (6.2) holds.

Additionally, we require that the newly defined entropy satisfies scaling and additivity under
composition. Since the initial entropies $S_\Gamma(X)$ already satisfy them, these requirements become
conditions on the additive constants $B(\Gamma)$:

$$B(t_1\Gamma_1 \times t_2\Gamma_2) = t_1 B(\Gamma_1) + t_2 B(\Gamma_2) \tag{6.5}$$

for all state spaces $\Gamma_1$, $\Gamma_2$ under consideration and $t_1, t_2 > 0$.

As we shall see, the additivity requirement is not trivial to satisfy, the reason being that a 
given substance, say hydrogen, can appear in many different compound systems with many different 
ratios of the mole numbers of the constituents of the compound system. 

The condition (6.4) means that

$$B(\Gamma) - B(\Gamma') \le S_{\Gamma'}(Y) - S_\Gamma(X)$$

whenever $X \prec Y$. Let us denote by $D(\Gamma, \Gamma')$ the minimal entropy difference for all adiabatic
processes that can take us from $\Gamma$ to $\Gamma'$, i.e.,

$$D(\Gamma, \Gamma') := \inf\{S_{\Gamma'}(Y) - S_\Gamma(X) : X \prec Y\}. \tag{6.6}$$

It is to be noted that $D(\Gamma, \Gamma')$ can be positive or negative and $D(\Gamma, \Gamma') \ne D(\Gamma', \Gamma)$ in general.
Clearly $D(\Gamma, \Gamma) = 0$. Definition (6.6) makes sense only if there is at least one adiabatic process
that goes from $\Gamma$ to $\Gamma'$, and it is convenient to define $D(\Gamma, \Gamma') = +\infty$ if there is no such process.
In terms of the $D(\Gamma, \Gamma')$'s condition (6.4) means precisely that

$$-D(\Gamma', \Gamma) \le B(\Gamma) - B(\Gamma') \le D(\Gamma, \Gamma'). \tag{6.7}$$

Although $D(\Gamma, \Gamma')$ has no particular sign, we can assert the crucial fact that

$$-D(\Gamma', \Gamma) \le D(\Gamma, \Gamma'). \tag{6.8}$$

This is trivially true if $D(\Gamma, \Gamma') = +\infty$ or $D(\Gamma', \Gamma) = +\infty$. If both are $< \infty$ the reason (6.8) is true
is simply (6.1): By the definition (6.6), there is a pair of states $X \in \Gamma$ and $Y \in \Gamma'$ such that $X \prec Y$
and $S_{\Gamma'}(Y) - S_\Gamma(X) = D(\Gamma, \Gamma')$ (or at least as closely as we please). Likewise, we can find $W \in \Gamma$
and $Z \in \Gamma'$, such that $Z \prec W$ and $S_\Gamma(W) - S_{\Gamma'}(Z) = D(\Gamma', \Gamma)$. Then, in the compound system
$\Gamma \times \Gamma'$ we have that $(X, Z) \prec (W, Y)$, and this, by (6.1), implies (6.8). Thus $D(\Gamma, \Gamma') > -\infty$ if
there is at least one adiabatic process from $\Gamma'$ to $\Gamma$.
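
Conditions (6.7) and (6.8) can be illustrated numerically for a finite collection of state spaces. The following Python sketch is not part of the argument of the paper: it takes an invented table of values $D(\Gamma, \Gamma')$, checks (6.8), and then looks for constants satisfying $B(\Gamma) - B(\Gamma') \le D(\Gamma, \Gamma')$ for every ordered pair, which is equivalent to (6.7). The bounds are treated as a system of difference constraints and solved by Bellman-Ford relaxation, a standard technique that is swapped in here for illustration. The additivity requirement (6.5) is ignored, and the names `SPACES`, `D`, `d`, `check_68` and `solve_constants` are assumptions of this sketch.

```python
import math

INF = math.inf

# Hypothetical values D[(a, b)] = D(a, b) for three state spaces; a missing
# pair means D = +infinity (no direct adiabatic process).
SPACES = ["G1", "G2", "G3"]
D = {
    ("G1", "G2"): 2.0, ("G2", "G1"): -1.5,
    ("G2", "G3"): 0.5, ("G3", "G2"): 0.0,
    ("G1", "G3"): 3.0,
}

def d(a, b):
    """D(a, b), with D(a, a) = 0 and +infinity for unlisted pairs."""
    return 0.0 if a == b else D.get((a, b), INF)

def check_68():
    """Check -D(b, a) <= D(a, b) for every ordered pair, as in (6.8)."""
    return all(-d(b, a) <= d(a, b) for a in SPACES for b in SPACES)

def solve_constants():
    """Find B with B[a] - B[b] <= D(a, b) for all pairs, or return None.

    Each finite D(a, b) is read as an edge b -> a of weight D(a, b).
    Bellman-Ford relaxation from a virtual source at distance 0 to every
    node yields feasible constants unless a negative cycle exists.
    """
    B = {s: 0.0 for s in SPACES}
    for _ in range(len(SPACES)):
        changed = False
        for (a, b), w in D.items():
            if B[b] + w < B[a]:
                B[a] = B[b] + w
                changed = True
        if not changed:
            return B
    return None  # still relaxing after len(SPACES) passes: negative cycle
```

Calling `solve_constants()` on this table returns one admissible choice of constants; any other solution of the same inequalities would serve equally well, which reflects the non-uniqueness discussed later in this section.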

Some reflection shows us that consistency in the definition of the entropy constants $B(\Gamma)$
requires us to consider all possible chains of adiabatic processes leading from one space to another 
via intermediate steps. Moreover, the additivity requirement leads us to allow the use of a 'catalyst' 
in these processes, i.e., an auxiliary system, that is recovered at the end, although a state change 
within this system might take place. 

For this reason we now define new quantities, $E(\Gamma, \Gamma')$ and $F(\Gamma, \Gamma')$, in the following way.
First, for any given $\Gamma$ and $\Gamma'$ we consider all finite chains of state spaces, $\Gamma = \Gamma_1, \Gamma_2, \dots, \Gamma_N = \Gamma'$
such that $D(\Gamma_i, \Gamma_{i+1}) < \infty$ for all $i$, and we define

$$E(\Gamma, \Gamma') := \inf\{D(\Gamma_1, \Gamma_2) + \cdots + D(\Gamma_{N-1}, \Gamma_N)\}, \tag{6.9}$$

where the infimum is taken over all such chains linking $\Gamma$ with $\Gamma'$. Note that $E(\Gamma, \Gamma') \le D(\Gamma, \Gamma')$
and $E(\Gamma, \Gamma')$ could be $< \infty$ even if there is no direct adiabatic process linking $\Gamma$ and $\Gamma'$, i.e.,
$D(\Gamma, \Gamma') = \infty$. We then define

$$F(\Gamma, \Gamma') := \inf\{E(\Gamma \times \Gamma_0, \Gamma' \times \Gamma_0)\}, \tag{6.10}$$

where the infimum is taken over all state spaces $\Gamma_0$. (These are the 'catalysts'.)
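
For a finite collection of state spaces the chain infimum (6.9) is an all-pairs shortest-path problem. The Python sketch below is only an illustration under that finiteness assumption (in the text the infimum runs over all state spaces): it computes $E$ from an invented table of direct values $D$ with the Floyd-Warshall recursion. It assumes that no closed chain has negative total $D$, and it does not attempt the further infimum over catalysts in (6.10). The function name `chain_infimum` and the numbers are hypothetical.

```python
import math

INF = math.inf

def chain_infimum(spaces, D):
    """Return E[(a, b)] = min over finite chains a -> ... -> b of summed D.

    This is the Floyd-Warshall all-pairs shortest-path recursion; it assumes
    that no chain returning to its starting space has negative total D.
    """
    E = {(a, b): (0.0 if a == b else D.get((a, b), INF))
         for a in spaces for b in spaces}
    for k in spaces:
        for a in spaces:
            for b in spaces:
                via_k = E[(a, k)] + E[(k, b)]
                if via_k < E[(a, b)]:
                    E[(a, b)] = via_k
    return E

# Hypothetical direct values; (G3, G1) is absent, i.e. D(G3, G1) = +infinity.
spaces = ["G1", "G2", "G3"]
D = {
    ("G1", "G2"): 2.0, ("G2", "G1"): -1.5,
    ("G2", "G3"): 0.5, ("G3", "G2"): 0.0,
    ("G1", "G3"): 3.0,
}
E = chain_infimum(spaces, D)
```

In this toy table the chain through `G2` lowers the value for the pair (`G1`, `G3`) below the direct one, and it makes the value for (`G3`, `G1`) finite although no direct process is listed, which are the two effects noted after (6.9).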
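The following properties of $F(\Gamma, \Gamma')$ are easily verified:

$$F(\Gamma, \Gamma) = 0 \tag{6.11}$$

$$F(t\Gamma, t\Gamma') = tF(\Gamma, \Gamma') \quad \text{for } t > 0 \tag{6.12}$$

$$F(\Gamma_1 \times \Gamma_2, \Gamma_1' \times \Gamma_2') \le F(\Gamma_1, \Gamma_1') + F(\Gamma_2, \Gamma_2') \tag{6.13}$$

$$F(\Gamma \times \Gamma_0, \Gamma' \times \Gamma_0) = F(\Gamma, \Gamma') \quad \text{for all } \Gamma_0. \tag{6.14}$$

In fact, (6.11) and (6.12) are also shared by the $D$'s and the $E$'s. The 'subadditivity' (6.13) holds
also for the $E$'s, but the 'translational invariance' (6.14) might only hold for the $F$'s.
From (6.13) and (6.14) it follows that the $F$'s satisfy the 'triangle inequality'

$$F(\Gamma, \Gamma'') \le F(\Gamma, \Gamma') + F(\Gamma', \Gamma'') \tag{6.15}$$

(put $\Gamma_1 = \Gamma$, $\Gamma_2 = \Gamma_1' = \Gamma'$, $\Gamma_2' = \Gamma''$ in (6.13) and cancel the common factor $\Gamma'$ on the left side
by (6.14)). This inequality also holds for the $E$'s as is obvious from the
definition (6.9). A special case (using (6.11)) is the analogue of (6.8):

$$-F(\Gamma', \Gamma) \le F(\Gamma, \Gamma') \tag{6.16}$$

(This is trivial if $F(\Gamma, \Gamma')$ or $F(\Gamma', \Gamma)$ is infinite, otherwise use (6.15) with $\Gamma = \Gamma''$.)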
Obviously, the following inequalities hold:

$$-D(\Gamma', \Gamma) \le -E(\Gamma', \Gamma) \le -F(\Gamma', \Gamma) \le F(\Gamma, \Gamma') \le E(\Gamma, \Gamma') \le D(\Gamma, \Gamma').$$

The importance of the $F$'s for the determination of the additive constants is made clear in the
following theorem:

THEOREM 6.1 (Constant entropy differences). If $\Gamma$ and $\Gamma'$ are two state spaces then
for any two points $X \in \Gamma$ and $Y \in \Gamma'$

$$X \prec Y \quad \text{if and only if} \quad S_\Gamma(X) + F(\Gamma, \Gamma') \le S_{\Gamma'}(Y). \tag{6.17}$$



Remarks: (1). Since $F(\Gamma, \Gamma') \le D(\Gamma, \Gamma')$ the theorem is trivially true when $F(\Gamma, \Gamma') = +\infty$, in
the sense that there is then no adiabatic process from $\Gamma$ to $\Gamma'$. The reason for the title 'constant
entropy differences' is that the minimum jump between the entropies $S_\Gamma(X)$ and $S_{\Gamma'}(Y)$ for $X \prec Y$
to be possible is independent of $X$.

(2). There is an interesting corollary of Theorem 6.1. We know, from the definition (6.6),
that $X \prec Y$ only if $S_\Gamma(X) + D(\Gamma, \Gamma') \le S_{\Gamma'}(Y)$. Since $F(\Gamma, \Gamma') \le D(\Gamma, \Gamma')$, Theorem 6.1 tells us
two things:

$$X \prec Y \quad \text{if and only if} \quad S_\Gamma(X) + F(\Gamma, \Gamma') \le S_{\Gamma'}(Y) \tag{6.18}$$

and

$$S_\Gamma(X) + D(\Gamma, \Gamma') \le S_{\Gamma'}(Y) \quad \text{if and only if} \quad S_\Gamma(X) + F(\Gamma, \Gamma') \le S_{\Gamma'}(Y). \tag{6.19}$$

We cannot conclude from this, however, that $D(\Gamma, \Gamma') = F(\Gamma, \Gamma')$.

Proof: The 'only if' part is obvious because $F(\Gamma, \Gamma') \le D(\Gamma, \Gamma')$, and thus our goal is to prove
the 'if' part. For clarity, we begin by assuming that the infima in (6.6), (6.9) and (6.10) are minima,
i.e., there are state spaces $\Gamma_0, \Gamma_1, \Gamma_2, \dots, \Gamma_N$ and states $X_i \in \Gamma_i$ and $Y_i \in \Gamma_i$, for $i = 0, \dots, N$ and
states $\tilde X \in \Gamma$ and $\tilde Y \in \Gamma'$ such that

$$(\tilde X, X_0) \prec Y_1$$

$$X_i \prec Y_{i+1} \quad \text{for } i = 1, \dots, N-1$$

$$X_N \prec (\tilde Y, Y_0) \tag{6.20}$$

and $F(\Gamma, \Gamma')$ is given by

$$F(\Gamma, \Gamma') = D(\Gamma \times \Gamma_0, \Gamma_1) + D(\Gamma_1, \Gamma_2) + \cdots + D(\Gamma_N, \Gamma' \times \Gamma_0)$$

$$= S_{\Gamma'}(\tilde Y) + \sum_{j=0}^{N} S_j(Y_j) - S_\Gamma(\tilde X) - \sum_{j=0}^{N} S_j(X_j). \tag{6.21}$$

In (6.21) we used the abbreviated notation $S_j$ for $S_{\Gamma_j}$ and we used the fact that $S_{\Gamma \times \Gamma_0} = S_\Gamma + S_0$.
From the assumed inequality $S_\Gamma(X) + F(\Gamma, \Gamma') \le S_{\Gamma'}(Y)$ and (6.21) we conclude that

$$S_\Gamma(X) + S_{\Gamma'}(\tilde Y) + \sum_{j=0}^{N} S_j(Y_j) \le S_\Gamma(\tilde X) + S_{\Gamma'}(Y) + \sum_{j=0}^{N} S_j(X_j). \tag{6.22}$$

However, both sides of this inequality can be thought of as the entropy of a state in the compound
space $\hat\Gamma := \Gamma \times \Gamma' \times \Gamma_0 \times \Gamma_1 \times \cdots \times \Gamma_N$. The entropy principle (6.1) for $\hat\Gamma$ then tells us that

$$(X, \tilde Y, Y_0, \dots, Y_N) \prec (\tilde X, Y, X_0, \dots, X_N). \tag{6.23}$$

On the other hand, using (6.20) and the axiom of consistency, we have that

$$(\tilde X, X_0, X_1, \dots, X_N) \prec (\tilde Y, Y_0, Y_1, \dots, Y_N). \tag{6.24}$$

By the consistency axiom again, we have from (6.24) that $(\tilde X, Y, X_0, \dots, X_N) \prec
(Y, \tilde Y, Y_0, Y_1, \dots, Y_N)$. From transitivity we then have

$$(X, \tilde Y, Y_0, Y_1, \dots, Y_N) \prec (Y, \tilde Y, Y_0, Y_1, \dots, Y_N),$$

and the desired conclusion, $X \prec Y$, follows from the cancellation law.

If $F(\Gamma, \Gamma')$ is not a minimum, then, for every $\varepsilon > 0$, there is a chain of spaces $\Gamma_0, \Gamma_1, \Gamma_2, \dots,
\Gamma_N$ and corresponding states as in (6.20) such that (6.21) holds to within $\varepsilon$ and (6.22) becomes (for
simplicity of notation we omit the explicit dependence of the states and $N$ on $\varepsilon$)

$$S_\Gamma(X) + S_{\Gamma'}(\tilde Y) + \sum_{j=0}^{N} S_j(Y_j) \le S_\Gamma(\tilde X) + S_{\Gamma'}(Y) + \sum_{j=0}^{N} S_j(X_j) + \varepsilon. \tag{6.25}$$

Now choose any auxiliary state space $\tilde\Gamma$, with entropy function $\tilde S$, and two states $Z_0, Z_1 \in \tilde\Gamma$ with
$Z_0 \prec\prec Z_1$. The space $\Gamma$ itself could be used for this purpose, but for clarity we regard $\tilde\Gamma$ as distinct.
Define $\delta(\varepsilon) := [\tilde S(Z_1) - \tilde S(Z_0)]^{-1}\varepsilon$. Recalling that $\delta \tilde S(Z) = \tilde S(\delta Z)$ by scaling, we see that (6.25)
implies the following analogue of (6.23).

$$(\delta Z_0, X, \tilde Y, Y_0, \dots, Y_N) \prec (\delta Z_1, \tilde X, Y, X_0, \dots, X_N). \tag{6.26}$$

Proceeding as before, we conclude that

$$(\delta Z_0, X, \tilde Y, Y_0, Y_1, \dots, Y_N) \prec (\delta Z_1, Y, \tilde Y, Y_0, Y_1, \dots, Y_N),$$

and thus $(X, \delta Z_0) \prec (Y, \delta Z_1)$ by the cancellation law. However, $\delta \to 0$ as $\varepsilon \to 0$ and hence $X \prec Y$
by the stability axiom. ■

According to Theorem 6.1 the determination of the entropy constants $B(\Gamma)$ amounts to satisfying
the estimates

$$-F(\Gamma', \Gamma) \le B(\Gamma) - B(\Gamma') \le F(\Gamma, \Gamma') \tag{6.27}$$

together with the linearity condition (6.5). It is clear that (6.27) can only be satisfied with finite
constants $B(\Gamma)$ and $B(\Gamma')$, if $F(\Gamma, \Gamma') > -\infty$. While the assumptions made so far do not exclude
$F(\Gamma, \Gamma') = -\infty$ as a possibility, it follows from (6.16) that this can only be the case if at the
same time $F(\Gamma', \Gamma) = +\infty$, i.e., there is no chain of intermediate adiabatic processes in the sense
described above that allows a passage from $\Gamma'$ back to $\Gamma$. For all we know this is not the situation
encountered in nature and we exclude it by an additional axiom. Let us write $\Gamma \prec \Gamma'$ and say that
$\Gamma$ is connected to $\Gamma'$ if $F(\Gamma, \Gamma') < \infty$, i.e. if there is a finite chain of state spaces, $\Gamma_0, \Gamma_1, \Gamma_2, \dots, \Gamma_N$
and states such that (6.20) holds with $\tilde X \in \Gamma$ and $\tilde Y \in \Gamma'$. Our new axiom is the following:

M) Absence of sinks. If $\Gamma$ is connected to $\Gamma'$ then $\Gamma'$ is connected to $\Gamma$, i.e., $\Gamma \prec \Gamma' \Longrightarrow \Gamma' \prec \Gamma$.

The introduction of this axiom may seem a little special, even artificial, but it is not. For one
thing, it is not used in Theorem 6.1 which, like the entropy principle itself, states the condition
under which an adiabatic process from $X$ to $Y$ is possible. Axiom M is only needed for setting the
additive entropy constants so that (6.17) can be converted into a statement involving $S(X)$ and
$S(Y)$ alone, as in Theorem 6.2. Second, axiom M should not be misread as saying that if we can
make water from hydrogen and oxygen then we can make hydrogen and oxygen directly from water 
(which requires electrolysis). What it does require is that water can eventually be converted into its
chemical elements, but not necessarily in one step and not necessarily reversibly. The intervention 
of irreversible processes involving other substances is allowed. Were axiom M to fail in this case 
then all the oxygen in the universe would eventually turn up in water and we should have to rely 
on supernovae to replenish the supply from time to time.

By axiom M (and the obvious transitivity of the relation $\prec$ for state spaces), connectedness
defines an equivalence relation between state spaces, and instead of $\Gamma \prec \Gamma'$ we can write

$$\Gamma \sim \Gamma' \tag{6.28}$$

to indicate that the $\prec$ relation among state spaces goes both ways. As already noted, $\Gamma \sim \Gamma'$ is
equivalent to $-\infty < F(\Gamma, \Gamma') < \infty$ and $-\infty < F(\Gamma', \Gamma) < \infty$.

Without further assumptions (note, in particular, that no assumptions about 'semi-permeable 
membranes' have been made) we can now derive the entropy principle in the following weak version: 

THEOREM 6.2 (Weak form of the entropy principle). Assume axiom M in addition
to A1-A7, S1-S3, T1-T5. Then the entropy constants $B(\Gamma)$ can be chosen in such a way that the
entropy $S$, defined on all states of all systems by (6.3), satisfies additivity and extensivity (2.4),
(2.5), and moreover

$$X \prec Y \quad \text{implies} \quad S(X) \le S(Y). \tag{6.29}$$

Proof: The proof is a simple application of the Hahn-Banach theorem (see, e.g., the appendix to
(Giles, 1964) and (Reed and Simon, 1972)). Consider the set $\mathcal{S}$ of all pairs of state spaces $(\Gamma, \Gamma')$.
On $\mathcal{S}$ we define an equivalence relation by declaring $(\Gamma, \Gamma')$ to be equivalent to $(\Gamma \times \Gamma_0, \Gamma' \times \Gamma_0)$ for
all $\Gamma_0$. Denote by $[\Gamma, \Gamma']$ the equivalence class of $(\Gamma, \Gamma')$ and let $\mathcal{L}$ be the set of all these equivalence
classes.

On $\mathcal{L}$ we define multiplication by scalars and addition in the following way:

$$t[\Gamma, \Gamma'] := [t\Gamma, t\Gamma'] \quad \text{for } t > 0$$

$$t[\Gamma, \Gamma'] := [-t\Gamma', -t\Gamma] \quad \text{for } t < 0$$

$$0[\Gamma, \Gamma'] := [\Gamma, \Gamma] = [\Gamma', \Gamma']$$

$$[\Gamma_1, \Gamma_1'] + [\Gamma_2, \Gamma_2'] := [\Gamma_1 \times \Gamma_2, \Gamma_1' \times \Gamma_2'].$$

With these operations $\mathcal{L}$ becomes a vector space, which is infinite dimensional in general. The zero
element is the class $[\Gamma, \Gamma]$ for any $\Gamma$, because by our definition of the equivalence relation $(\Gamma, \Gamma)$ is
equivalent to $(\Gamma \times \Gamma', \Gamma \times \Gamma')$, which in turn is equivalent to $(\Gamma', \Gamma')$. Note that for the same reason
$[\Gamma', \Gamma]$ is the negative of $[\Gamma, \Gamma']$.

Next, we define a function $H$ on $\mathcal{L}$ by

$$H([\Gamma, \Gamma']) := F(\Gamma, \Gamma').$$

Because of (6.14), this function is well defined and it takes values in $(-\infty, \infty]$. Moreover, it follows
from (6.12) and (6.13) that $H$ is homogeneous, i.e., $H(t[\Gamma, \Gamma']) = tH([\Gamma, \Gamma'])$, and subadditive, i.e.,
$H([\Gamma_1, \Gamma_1'] + [\Gamma_2, \Gamma_2']) \le H([\Gamma_1, \Gamma_1']) + H([\Gamma_2, \Gamma_2'])$. Likewise,

$$G([\Gamma, \Gamma']) := -F(\Gamma', \Gamma)$$

is homogeneous and superadditive, i.e., $G([\Gamma_1, \Gamma_1'] + [\Gamma_2, \Gamma_2']) \ge G([\Gamma_1, \Gamma_1']) + G([\Gamma_2, \Gamma_2'])$. By (6.16)
we have $G \le H$ so, by the Hahn-Banach theorem, there exists a real-valued linear function $L$ on $\mathcal{L}$
lying between $G$ and $H$; that is

$$-F(\Gamma', \Gamma) \le L([\Gamma, \Gamma']) \le F(\Gamma, \Gamma'). \tag{6.30}$$

Pick any fixed $\Gamma_0$ and define

$$B(\Gamma) := L([\Gamma_0 \times \Gamma, \Gamma_0]).$$

By linearity, $L$ satisfies $L([\Gamma, \Gamma']) = -L(-[\Gamma, \Gamma']) = -L([\Gamma', \Gamma])$. We then have

$$B(\Gamma) - B(\Gamma') = L([\Gamma_0 \times \Gamma, \Gamma_0]) + L([\Gamma_0, \Gamma_0 \times \Gamma']) = L([\Gamma, \Gamma'])$$

and hence (6.27) is satisfied. ■

From the proof of Theorem 6.2 it is clear that the indeterminacy of the additive constants
$B(\Gamma)$ can be traced back to the non-uniqueness of the linear function $L([\Gamma, \Gamma'])$ lying between
$G([\Gamma, \Gamma']) = -F(\Gamma', \Gamma)$ and $H([\Gamma, \Gamma']) = F(\Gamma, \Gamma')$. This non-uniqueness has two possible sources:
One is that some pairs of state spaces $\Gamma$ and $\Gamma'$ may not be connected, i.e., $F(\Gamma, \Gamma')$ may be infinite
(in which case $F(\Gamma', \Gamma)$ is also infinite by axiom M). The other possibility is that there is a finite,
but positive 'gap' between $G$ and $H$, i.e.,

$$-F(\Gamma', \Gamma) < F(\Gamma, \Gamma') \tag{6.31}$$

might hold for some state spaces, even if both sides are finite.

In nature only states containing the same amount of the chemical elements can be transformed
into each other. Hence $F(\Gamma, \Gamma') = +\infty$ for many pairs of state spaces, in particular, for those
that contain different amounts of some chemical element. The constants $B(\Gamma)$ are therefore never
unique: For each equivalence class of state spaces (with respect to $\sim$) one can define a constant
that is arbitrary except for the proviso that the constants should be additive and extensive under
composition and scaling of systems. In our world there are 92 chemical elements (or, strictly
speaking, a somewhat larger number, $N$, since one should count different isotopes as different
elements), and this leaves us with at least 92 free constants that specify the entropy of one mole of
each of the chemical elements in some specific state.

The other possible source of non-uniqueness, a nonzero gap (6.31) is, as far as we know,
not realized in nature, although it is a logical possibility. The true situation seems rather to be
the following: The equivalence class $[\Gamma]$ (with respect to $\sim$) of every state space $\Gamma$ contains a
distinguished state space

$$\Lambda([\Gamma]) = \lambda_1\Gamma_1 \times \cdots \times \lambda_N\Gamma_N$$

where the $\Gamma_i$ are the state spaces of one mole of each of the chemical elements, and the numbers
$(\lambda_1, \dots, \lambda_N)$ specify the amount of each chemical element in $\Gamma$. We have

$$\Lambda([t\Gamma]) = t\Lambda([\Gamma]) \tag{6.32}$$

$$\Lambda([\Gamma \times \Gamma']) = \Lambda([\Gamma]) \times \Lambda([\Gamma']). \tag{6.33}$$

Moreover (and this is the crucial 'experimental fact'),

$$-F(\Lambda([\Gamma]), \Gamma) = F(\Gamma, \Lambda([\Gamma])) \tag{6.34}$$

for all $\Gamma$. Note that (6.34) is subject to experimental verification by measuring on the one hand
entropy differences for processes that synthesize chemical compounds from the elements (possibly
through many intermediate steps and with the aid of catalysts), and on the other hand for processes
where chemical compounds are decomposed into the elements.
It follows from (6.15), (6.16) and (6.34) that

$$F(\Gamma, \Gamma') = F(\Gamma, \Lambda([\Gamma])) + F(\Lambda([\Gamma]), \Gamma') \tag{6.35}$$

and

$$-F(\Gamma', \Gamma) = F(\Gamma, \Gamma') \tag{6.36}$$

for all $\Gamma' \sim \Gamma$. Moreover, an explicit formula for $B(\Gamma)$ can be given in this good case:

$$B(\Gamma) = F(\Gamma, \Lambda([\Gamma])). \tag{6.37}$$

If $F(\Gamma, \Gamma') = \infty$, then (6.27) holds trivially, while for $\Gamma \sim \Gamma'$ we have by (6.35) and (6.36)

$$B(\Gamma) - B(\Gamma') = F(\Gamma, \Gamma') = -F(\Gamma', \Gamma), \tag{6.38}$$

i.e., the inequality (6.27) is saturated. It is also clear that in this case $B(\Gamma)$ is unique up to the choice
of arbitrary constants for the fixed systems $\Gamma_1, \dots, \Gamma_N$. The particular choice (6.37) corresponds
to putting $B(\Gamma_i) = 0$ for the chemical elements $i = 1, \dots, N$.
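
To make the good case concrete, here is a small Python sketch with invented numbers. It assumes a hypothetical table `formation[name]` holding $F(\Gamma, \Lambda([\Gamma]))$ for a few state spaces in one equivalence class, sets $B(\Gamma)$ by (6.37), and builds $F(\Gamma, \Gamma')$ from (6.35) together with (6.34). It then checks that (6.36) and (6.38) hold for every pair. The names `formation`, `B` and `F` and all values are illustrative assumptions, not data from the paper.

```python
# Hypothetical values of F(Gamma, Lambda([Gamma])) for three state spaces in
# the same equivalence class; the numbers are invented for illustration.
formation = {"elements": 0.0, "compound_a": -1.2, "compound_b": 0.7}

def B(space):
    """Entropy constant from (6.37): B(Gamma) = F(Gamma, Lambda([Gamma]))."""
    return formation[space]

def F(space, other):
    """F(Gamma, Gamma') from (6.35), using (6.34) for the second term."""
    # By (6.34), F(Lambda, Gamma') = -F(Gamma', Lambda).
    return formation[space] - formation[other]

names = list(formation)
# (6.36): no gap between -F(Gamma', Gamma) and F(Gamma, Gamma').
no_gap = all(-F(b, a) == F(a, b) for a in names for b in names)
# (6.38): the difference of the constants saturates the bound (6.27).
saturated = all(B(a) - B(b) == F(a, b) for a in names for b in names)
```

The entry `"elements"` plays the role of the distinguished space $\Lambda([\Gamma])$ itself, so its constant vanishes, in line with the normalization stated after (6.38).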

From Theorem 6.1 it follows that in the good case just described the comparison principle 
holds in the sense that all states belonging to systems in the same equivalence class are comparable, 
and the relation $\prec$ is exactly characterized by the entropy function, i.e., the full entropy principle
holds. 

If there is a genuine gap, (6.31), then for some pair of state spaces we might have only the weak
version of the entropy principle, Theorem 6.2. Moreover, it follows from Theorem 6.1 that in this
case there are no states $X \in \Gamma$ and $Y \in \Gamma'$ such that $X \overset{A}{\sim} Y$. Hence, in order for the full entropy
principle to hold as far as $\Gamma$ and $\Gamma'$ are concerned, it is only necessary to ensure that $X \prec\prec Y$
implies $S(X) < S(Y)$, and this will be the case (again by Theorem 6.1) if and only if

$$-F(\Gamma', \Gamma) < B(\Gamma) - B(\Gamma') < F(\Gamma, \Gamma'). \tag{6.39}$$

In other words, we would have the full entropy principle, gaps notwithstanding, if we could be
sure that whenever (6.31) holds then the inequalities in (6.30) are both strict inequalities.

We are not aware of a proof of the Hahn-Banach theorem that will allow us to conclude that
(6.30) is strict in all cases where (6.31) holds. If, however, the dimension of the linear space $\mathcal{L}$
considered in the proof of Theorem 6.2 were finite then the Hahn-Banach theorem would allow us
to choose the $B$'s in this way. This is a consequence of the following lemma.

LEMMA 6.1 (Strict Hahn-Banach). Let $V$ be a finite dimensional, real vector space
and $p : V \to \mathbb{R}$ subadditive, i.e., $p(x + y) \le p(x) + p(y)$ for all $x, y \in V$, and homogeneous,
i.e., $p(\lambda x) = \lambda p(x)$ for all $\lambda \ge 0$, $x \in V$. Then there is a linear functional $L$ on $V$, such that
$-p(-x) \le L(x) \le p(x)$ for all $x \in V$. Moreover, for those $x$ for which $-p(-x) < p(x)$ holds we
have the strict inequalities $-p(-x) < L(x) < p(x)$.

Proof: Note first that subadditivity implies that $p(x) - p(-y) \le p(x + y) \le p(x) + p(y)$ for
all $x, y \in V$. Define $V_0 = \{x : -p(-x) = p(x)\}$. If $x \in V$ and $y \in V_0$, then $p(x) + p(y) =
p(x) - p(-y) \le p(x + y) \le p(x) + p(y)$ and hence $p(x) + p(y) = p(x + y)$. (Note that $x$ need not
belong to $V_0$.) If $x \in V_0$ and $\lambda \ge 0$, then $p(\lambda x) = \lambda p(x) = \lambda(-p(-x)) = -p(-\lambda x)$, and if $\lambda < 0$
we have, in the same way, $p(\lambda x) = p((-\lambda)(-x)) = (-\lambda)p(-x) = \lambda(-p(-x)) = \lambda p(x)$. Thus $V_0$ is
a linear space, and $p$ is a linear functional on it. We define $L(x) = p(x)$ for $x \in V_0$.

Let $V_0'$ be an algebraic complement of $V_0$, i.e., all $x \in V$ can be written as $x = y + z$ with
$y \in V_0$, $z \in V_0'$ and the decomposition is unique if $x \ne 0$. On $V_0'$ the strict inequality $-p(-x) < p(x)$
holds for all $x \ne 0$. If $L$ can be defined on $V_0'$ such that $-p(-x) < L(x) < p(x)$ for all $V_0' \ni x \ne 0$
we reach our goal by defining $L(x + y) = L(x) + L(y)$ for $x \in V_0'$, $y \in V_0$. Hence it suffices to
consider the case that $V_0 = \{0\}$.

Now suppose $V_1 \subset V$ is a linear space and $L$ has been extended from $\{0\}$ to $V_1$ such that
our requirements are fulfilled on $V_1$, i.e., $-p(-x) < L(x) < p(x)$ for $x \in V_1$, $x \ne 0$. Define, for
$x \in V$,

$$\bar p(x) = \inf_{y \in V_1}\{p(x + y) - L(y)\}.$$

By subadditivity it is clear that for all $x$

$$-p(-x) \le -\bar p(-x) \le \bar p(x) \le p(x).$$

Since $V$ is finite dimensional (by assumption) and $p$ continuous (by convexity) the infimum is, in
fact, a minimum for each $x$, i.e., $\bar p(x) = p(x + y) - L(y)$ with some $y \in V_1$, depending on $x$.

Suppose $V_1$ is not the whole of $V$. Pick $x_2$ linearly independent of $V_1$. On the space spanned
by $V_1$ and $x_2$ we define

$$L(\lambda x_2 + x_1) = (\lambda/2)(\bar p(x_2) - \bar p(-x_2)) + L(x_1)$$

if $x_1 \in V_1$, $\lambda \in \mathbb{R}$.

Then

$$p(\lambda x_2 + x_1) - L(\lambda x_2 + x_1) = p(\lambda x_2 + x_1) - L(x_1) - L(\lambda x_2) \ge \bar p(\lambda x_2) - L(\lambda x_2) \ge 0$$

and equality holds in the last inequality if and only if $\bar p(\lambda x_2) = -\bar p(-\lambda x_2)$, i.e.,

$$p(\lambda x_2 + y) + p(-\lambda x_2 + y') = L(y + y') \le p(y + y') \tag{6.40}$$

for some $y, y' \in V_1$ (depending on $\lambda x_2$). On the other hand,

$$p(\lambda x_2 + y) + p(-\lambda x_2 + y') \ge p(y + y')$$

by subadditivity, so (6.40) implies

$$L(y + y') = p(y + y'). \tag{6.41}$$

By our assumption about $V_1$ this holds only if $y + y' = 0$. But then

$$p(-\lambda x_2 + y') = p(-\lambda x_2 - y)$$

and from (6.40) and (6.41) we get $-p(-\lambda x_2 - y) = p(\lambda x_2 + y)$ and hence $\lambda x_2 = -y \in V_1$. Since
$x_2 \notin V_1$ this is only possible for $\lambda = 0$, in which case $p(x_1) = L(x_1)$ and hence (by our assumption
about $V_1$), $x_1 = 0$. Thus the statement $L(x) = p(x)$ for some $x$ lying in the span of $V_1$ and $x_2$
implies that $x = 0$. In the same way one shows that $L(x) = -p(-x)$ implies $x = 0$. Thus, we have
succeeded in extending $L$ from $V_1$ to the larger space $\mathrm{span}\{V_1, x_2\}$. Proceeding by induction we
obtain $L$ satisfying our requirements on all $V$. ■
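
The first step of this induction can be checked by hand in one dimension. The Python sketch below is a toy instance, not a general implementation: it takes $V = \mathbb{R}$, $V_1 = \{0\}$ and $x_2 = 1$, so that the auxiliary function of the proof coincides with $p$ and the extension formula reduces to a midpoint slope. The particular function `p` and the sample points are assumptions chosen for illustration.

```python
def p(x):
    """A subadditive, positively homogeneous function on R (toy choice)."""
    return max(2.0 * x, -1.0 * x)

def make_L():
    """First extension step of the proof with V_1 = {0} and x_2 = 1.

    With V_1 = {0} the auxiliary function of the proof reduces to p itself,
    so the slope of L is the midpoint (p(1) - p(-1)) / 2.
    """
    slope = 0.5 * (p(1.0) - p(-1.0))
    return lambda x: slope * x

L = make_L()
samples = [-3.0, -1.0, -0.25, 0.5, 2.0]
# Strict bounds -p(-x) < L(x) < p(x) at every nonzero sample point.
strict = all(-p(-x) < L(x) < p(x) for x in samples)
```

Here $-p(-x) < p(x)$ for every $x \ne 0$, so $V_0 = \{0\}$ and the lemma promises strict inequalities away from the origin, which is what `strict` tests on the sample points.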
Since the proof of the above version of the Hahn-Banach theorem proceeds inductively over 
subspaces of increasing dimension it generalizes in a straightforward way to spaces of countable 
algebraic dimension. Moreover, in such spaces the condition (6.39) could be fulfilled at any induction
step without modifying the constants previously defined. Hence, even in cases where (6.36) is
violated, this hypothetical weakening of the full entropy principle could never be detected in real
experiments involving only finitely many systems.

VII. SUMMARY AND CONCLUSIONS



In this final section we recall our notation for the convenience of the reader and collect all the
axioms introduced in Sects. II, III, IV and VI. We then review the logical structure of the paper and
the main conclusions.

Our axioms concern equilibrium states, denoted by $X$, $Y$ etc., and the relation $\prec$ of adiabatic
accessibility between them. If $X \prec Y$ and $Y \prec X$ we write $X \overset{A}{\sim} Y$, while $X \prec\prec Y$ means that
$X \prec Y$, but not $Y \prec X$. States belong to state spaces $\Gamma, \Gamma', \dots$ of systems, that may be simple or
compound. The composition of two state spaces $\Gamma$, $\Gamma'$ is the Cartesian product $\Gamma \times \Gamma'$ (the order
of the factors is unimportant); the composition of $X \in \Gamma$ and $Y \in \Gamma'$ is denoted $(X, Y) \in \Gamma \times \Gamma'$.
A state $X \in \Gamma$ may be scaled by a real parameter $t > 0$, leading to a state $tX$ in a scaled state
space $\Gamma^{(t)}$, sometimes written $t\Gamma$. For simple systems the states are parametrized by the energy
coordinate $U \in \mathbb{R}$ and the work coordinates $V \in \mathbb{R}^n$.

The axioms are grouped as follows: 

A. GENERAL AXIOMS

A1) Reflexivity. $X \overset{A}{\sim} X$.

A2) Transitivity. $X \prec Y$ and $Y \prec Z$ implies $X \prec Z$.

A3) Consistency. $X \prec X'$ and $Y \prec Y'$ implies $(X, Y) \prec (X', Y')$.

A4) Scaling invariance. If $X \prec Y$, then $tX \prec tY$ for all $t > 0$.

A5) Splitting and recombination. For $0 < t < 1$, $X \overset{A}{\sim} (tX, (1 - t)X)$.

A6) Stability. If $(X, \varepsilon Z_0) \prec (Y, \varepsilon Z_1)$ holds for a sequence of $\varepsilon$'s tending to zero and some states
$Z_0$, $Z_1$, then $X \prec Y$.

A7) Convex combination. Assume $X$ and $Y$ are states in the same state space, $\Gamma$, that has a
convex structure. If $t \in [0, 1]$ then $(tX, (1 - t)Y) \prec tX + (1 - t)Y$.

B. AXIOMS FOR SIMPLE SYSTEMS 

Let $\Gamma$, a convex subset of $\mathbb{R}^{n+1}$ for some $n > 0$, be the state space of a simple system.

S1) Irreversibility. For each $X \in \Gamma$ there is a point $Y \in \Gamma$ such that $X \prec\prec Y$. (Note: This
axiom is implied by T4, and hence it is not really independent.)

S2) Lipschitz tangent planes. For each $X \in \Gamma$ the forward sector $A_X = \{Y \in \Gamma : X \prec Y\}$ has
a unique support plane at $X$ (i.e., $A_X$ has a tangent plane at $X$). The slope of the tangent
plane is assumed to be a locally Lipschitz continuous function of $X$.

S3) Connectedness of the boundary. The boundary $\partial A_X$ of a forward sector is connected.

C. AXIOMS FOR THERMAL EQUILIBRIUM 

T1) Thermal contact. For any two simple systems with state spaces $\Gamma_1$ and $\Gamma_2$, there is another
simple system, the thermal join of $\Gamma_1$ and $\Gamma_2$, with state space

$$\Delta_{12} = \{(U, V_1, V_2) : U = U_1 + U_2 \text{ with } (U_1, V_1) \in \Gamma_1,\ (U_2, V_2) \in \Gamma_2\}.$$

Moreover,

$$\Gamma_1 \times \Gamma_2 \ni ((U_1, V_1), (U_2, V_2)) \prec (U_1 + U_2, V_1, V_2) \in \Delta_{12}.$$

T2) Thermal splitting. For any point $(U, V_1, V_2) \in \Delta_{12}$ there is at least one pair of states,
$(U_1, V_1) \in \Gamma_1$, $(U_2, V_2) \in \Gamma_2$, with $U = U_1 + U_2$, such that

$$(U, V_1, V_2) \overset{A}{\sim} ((U_1, V_1), (U_2, V_2)).$$

In particular, if $(U, V)$ is a state of a simple system $\Gamma$ and $\lambda \in [0, 1]$ then

$$(U, (1 - \lambda)V, \lambda V) \overset{A}{\sim} (((1 - \lambda)U, (1 - \lambda)V), (\lambda U, \lambda V)) \in \Gamma^{(1-\lambda)} \times \Gamma^{(\lambda)}.$$

If $(U, V_1, V_2) \overset{A}{\sim} ((U_1, V_1), (U_2, V_2))$ we write $(U_1, V_1) \overset{T}{\sim} (U_2, V_2)$.

T3) Zeroth law. If $X \overset{T}{\sim} Y$ and if $Y \overset{T}{\sim} Z$ then $X \overset{T}{\sim} Z$.

T4) Transversality. If $\Gamma$ is the state space of a simple system and if $X \in \Gamma$, then there exist
states $X_0 \overset{T}{\sim} X_1$ with $X_0 \prec\prec X \prec\prec X_1$.

T5) Universal temperature range. If $\Gamma_1$ and $\Gamma_2$ are state spaces of simple systems then, for
every $X \in \Gamma_1$ and every $V$ in the projection of $\Gamma_2$ onto the space of its work coordinates, there
is a $Y \in \Gamma_2$ with work coordinates $V$ such that $X \overset{T}{\sim} Y$.

D. AXIOM FOR MIXTURES AND REACTIONS 

Two state spaces, $\Gamma$ and $\Gamma'$, are said to be connected, written $\Gamma \prec \Gamma'$, if there are state spaces
$\Gamma_0, \Gamma_1, \Gamma_2, \dots, \Gamma_N$ and states $X_i \in \Gamma_i$ and $Y_i \in \Gamma_i$, for $i = 0, \dots, N$, and states $\tilde X \in \Gamma$ and $\tilde Y \in \Gamma'$
such that $(\tilde X, X_0) \prec Y_1$, $X_i \prec Y_{i+1}$ for $i = 1, \dots, N - 1$, and $X_N \prec (\tilde Y, Y_0)$.

M) Absence of sinks. If $\Gamma$ is connected to $\Gamma'$ then $\Gamma'$ is connected to $\Gamma$, i.e., $\Gamma \prec \Gamma' \Longrightarrow \Gamma' \prec \Gamma$.

The main goal of the paper is to derive the entropy principle (EP) from these properties of $\prec$:

There is a function, called entropy and denoted by $S$, on all states of all simple and compound
systems, such that

a) Monotonicity: If $X \prec\prec Y$, then $S(X) < S(Y)$, and if $X \overset{A}{\sim} Y$, then $S(X) = S(Y)$.
b) Additivity and extensivity: $S((X, X')) = S(X) + S(X')$ and $S(tX) = tS(X)$.

Differentiability of $S$ as function of the energy and work coordinates of simple systems is also
proved and temperature is derived from entropy. 

A central result on our road to the EP is a proof, from our axioms, of the comparison hypothesis 
(CH) for simple and compound systems, which says that for any two states $X$, $Y$ in the same state
space either $X \prec Y$ or $Y \prec X$ holds. This is stated in Theorem 4.8. The existence of an entropy
function is discussed already in Section II on the basis of Axioms A1-A6 alone assuming in addition 
CH. In the subsequent sections CH is derived from the other axioms. The main steps involved in 
this derivation of CH are as follows. 

The comparison hypothesis (which, once proved, is more appropriately called the comparison 
principle) is first derived for simple systems in Theorem 3.7 in Sect. III. This proof uses both the
special axioms S1-S3 of Sect. III and the general axioms A1-A7 introduced in Sect. II. On the other
hand, it should be stressed that Theorem 3.7 is independent of the discussion in Sect. II D-E, where 
an entropy function is constructed, assuming the validity of CH. 

The extension of CH to compound systems relies heavily on the axioms for thermal equilibrium 
that are discussed in Sect. IV. The key point is that by forming the thermal join of two simple 
systems we obtain a new simple system to which Theorem 3.7 can be applied. The extension of
CH from simple to compound systems is first carried out for products of scaled copies of the same
simple system (Theorem 4.4). Here the transversality axiom T4 plays an essential role by reducing 
the consideration of states of the compound system that are not in thermal equilibrium to states 
in the thermal join. 

The proof of CH for products of different simple systems requires more effort. The main step 
here is to prove the existence of 'entropy calibrators' (Theorem 4.7). This says that for each pair
of simple systems $\Gamma_1$, $\Gamma_2$ there exist four states, $X_0, X_1 \in \Gamma_1$, $Y_0, Y_1 \in \Gamma_2$ such that $X_0 \prec\prec X_1$,
$Y_0 \prec\prec Y_1$, but $(X_0, Y_1) \overset{A}{\sim} (X_1, Y_0)$. In establishing this property, we find it convenient to make
use of the existence of an entropy function for each of the spaces $\Gamma_1$ and $\Gamma_2$ separately, which, as
shown in Sects. II D-E, follows from axioms A1-A6 and the already established property CH for
products of scaled copies of the same simple system.

Once CH has been established for arbitrary products of simple systems the entropy principle 
for all adiabatic state changes, except for mixing of different substances and chemical reactions, 
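follows from the considerations of Sects. II D-E. An explicit formula for $S$ is given in Eq. (2.20):
We pick a reference system with two states $Z_0 \prec\prec Z_1$, and for each system $\Gamma$ a reference point
$X_\Gamma \in \Gamma$ is chosen in such a way that $X_{t\Gamma} = tX_\Gamma$ and $X_{\Gamma_1 \times \Gamma_2} = (X_{\Gamma_1}, X_{\Gamma_2})$. Then, for $X \in \Gamma$,

$$S(X) = \sup\{\lambda : (X_\Gamma, \lambda Z_1) \prec (X, \lambda Z_0)\}.$$

(For $\lambda < 0$, $(X_\Gamma, \lambda Z_1) \prec (X, \lambda Z_0)$ means, per definition, that $(X_\Gamma, -\lambda Z_0) \prec (X, -\lambda Z_1)$, and for
$\lambda = 0$ that $X_\Gamma \prec X$.)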
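
The supremum in this formula can be located by bisection once one has a way to decide the relation. The Python sketch below is a toy model only: the oracle `accessible` is an assumption that decides the relation from given (invented) values of an additive, extensive entropy, so the computation is circular as physics and serves merely to show how the formula behaves. The names `accessible` and `entropy_by_sup`, the search interval, and the numbers are hypothetical.

```python
def accessible(lam, s_x, s_ref, s_z0, s_z1):
    """Toy oracle for the relation (X_Gamma, lam*Z_1) prec (X, lam*Z_0).

    Assumes the relation is already characterised by an additive, extensive
    entropy with the given hypothetical values; the same inequality covers
    lam < 0 under the sign convention stated in the text.
    """
    return s_ref + lam * s_z1 <= s_x + lam * s_z0

def entropy_by_sup(s_x, s_ref, s_z0, s_z1, lo=-1e3, hi=1e3, steps=200):
    """Approximate sup{lam : accessible(lam)} by bisection on [lo, hi]."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if accessible(mid, s_x, s_ref, s_z0, s_z1):
            lo = mid   # mid is admissible, so the supremum is at least mid
        else:
            hi = mid   # mid is not admissible, so the supremum is below mid
    return lo

# Hypothetical entropies: S(X) = 5, S(X_Gamma) = 2, S(Z_0) = 1, S(Z_1) = 3.
value = entropy_by_sup(5.0, 2.0, 1.0, 3.0)
```

In this model the supremum equals $(S(X) - S(X_\Gamma)) / (S(Z_1) - S(Z_0))$, an affine rescaling of the input entropy, so the reference point fixes the zero and the pair $Z_0$, $Z_1$ fixes the unit.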

In Section V we prove that for a simple system the entropy function is a once continuously 
differentiable function of the energy and the work coordinates. The convexity axiom A7, which 
leads to concavity of the entropy, and the axiom S2 (Lipschitz tangent planes) are essential here.
We prove that the usual thermodynamic relations hold, in particular $T = (\partial S/\partial U)^{-1}$ defines the
absolute temperature. Up to this point neither temperature nor hotness and coldness have actually 
been used. In this section we also prove (in Theorem 5.6) that the entropy for every simple system 
is uniquely determined, up to an affine change of scale, by the level sets of S and T, i.e., by the 
adiabats and isotherms regarded only as sets, and without numerical values. 
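
As a numerical illustration of $T = (\partial S/\partial U)^{-1}$, the following Python sketch uses the textbook entropy of a monatomic ideal gas, which is an assumption brought in for the example and is not derived in this paper. The derivative is approximated by central differences; the function names, the step size `h`, and the chosen volume are hypothetical.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)

def entropy(U, V, n=1.0):
    """Monatomic ideal-gas entropy up to an additive constant (toy model).

    Units inside the logarithms are absorbed into the omitted constant.
    """
    return n * R * (1.5 * math.log(U) + math.log(V))

def temperature(U, V, n=1.0, h=1e-2):
    """T = (dS/dU)^(-1), with the derivative taken by central differences."""
    dS_dU = (entropy(U + h, V, n) - entropy(U - h, V, n)) / (2.0 * h)
    return 1.0 / dS_dU

U = 1.5 * R * 300.0   # energy of one mole at 300 K, from U = (3/2) n R T
T = temperature(U, V=0.0248)
```

For this model $\partial S/\partial U = 3nR/(2U)$, so the sketch should return a temperature close to 300 K regardless of the volume supplied.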

In the final Section VI we discuss the problem of fixing the additive entropy constants when 
processes that change the system by mixing and chemical reactions are taken into account. We 
show that, even without making any assumptions about the existence of unrealistic semi-permeable 
membranes, it is always possible to fix the constants in such a way that the entropy remains additive, 
and never decreases under adiabatic processes. This is not quite the full entropy principle, since 
there could still be states with $X \prec\prec Y$, but $S(X) = S(Y)$. This abnormal possibility, however, 
is irrelevant in practice, and we give a necessary and sufficient condition for the occurrence of the 
situation that seems to be realized in nature: The entropy of every substance is uniquely determined 
once an arbitrary entropy constant has been fixed for each of the chemical elements, and $X \prec\prec Y$ 
implies that $S(X) < S(Y)$. 

After this summary of the logical structure of the paper we add some remarks on the relation 
between our treatment of the second law and more conventional formulations, e.g., the classical 
statements of Kelvin, Clausius and Carathéodory paraphrased in Sect. I.A. What immediately strikes 
the eye is that these classical formulations are negative statements: They claim that certain processes 
are not possible. Thus, the Clausius formulation essentially says that thermal contact leads to an 
irreversible process. On the other hand, what the founding fathers seem to have taken for granted 
is that there also exist reversible processes. Thus the Clausius inequality, $\oint \delta Q/T \leq 0$, which 
ostensibly follows from his version of the second law and is the starting point for most textbook 
discussions of entropy, does not by itself lead to an entropy function. What is needed in this 
formulation is the existence of reversible processes, where equality holds (or at least processes that 
approximate equality arbitrarily closely). One might even question the possibility of attaching 
a precise meaning to '$\delta Q$' and '$T$' for irreversible processes. (See, however, Eq. (5.8) and the 
discussion preceding it, where the symbols are given a precise meaning in a concrete situation.) 
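
The role of equality in the Clausius inequality can be made concrete with the simplest textbook cycle, an engine working between two reservoirs. The sketch below is an illustration under explicit assumptions: heat is exchanged only with the two reservoirs, 'T' is taken to be the reservoir temperature (which sidesteps the question of what T means inside an irreversible process), and the function name and parameters are hypothetical. The cyclic sum vanishes exactly at the Carnot efficiency and is negative for any less efficient engine.

```python
def clausius_sum(q_hot: float, efficiency: float, t_hot: float, t_cold: float) -> float:
    """Sum of (heat absorbed)/(reservoir temperature) over one two-reservoir cycle.

    q_hot is the heat absorbed from the hot reservoir; the engine converts the
    fraction `efficiency` of it into work and rejects the rest to the cold one.
    """
    q_cold = q_hot * (1.0 - efficiency)  # heat rejected to the cold reservoir
    return q_hot / t_hot - q_cold / t_cold


if __name__ == "__main__":
    t_hot, t_cold = 600.0, 300.0
    carnot = 1.0 - t_cold / t_hot
    # Reversible limit: equality holds.
    print(clausius_sum(100.0, carnot, t_hot, t_cold))
    # Less efficient engine: strict inequality.
    print(clausius_sum(100.0, 0.3, t_hot, t_cold))
```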

The basic question we set out to examine is this: Why can adiabatic processes within a system 
be exactly characterized by the increase (more precisely, non-decrease) of an additive entropy 
function? In Section II, where the comparison principle CH is assumed, an answer is already given: 
It is because all reasonable notions of adiabatic accessibility should satisfy axioms A1-A6, and 
these axioms, together with CH, are equivalent to the existence of an additive entropy function that 
characterizes the relation. This is expressed in Theorem 2.2. If we now look at axioms A1-A6 and 
the comparison principle we see that these are all positive statements about the relation $\prec$: They 
all say that certain elementary processes are possible (provided some other processes are possible), 
and none of them says that some processes are impossible. In particular, the trivial case, when 
everything is accessible from everything else, is not in conflict with A1-A6 and the comparison 
principle: It corresponds to a constant entropy. 

From this point of view the existence of an entropy function is an issue that can, to a large 
extent, be discussed independently of the second law, as originally formulated by the founders 
(as given in Section I.A). The existence of entropy has more to do with comparability of states 
and reversibility than with irreversibility. In fact, one can conceive of mathematical examples of a 
relation that is characterized by a function S and satisfies A1-A6 and CH, but S is constant in 
a whole neighbourhood of some points, and the Clausius inequality fails. Conversely, the example 
of the 'world of thermometers', discussed in Sect. IV.D and Fig. 7, is relevant in this context. Here 
the second law in the sense of Clausius holds, but the Clausius equality $\oint \delta Q/T = 0$ cannot be 
achieved and there is no entropy that characterizes the relation for compound systems! 
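
The 'world of thermometers' can be mimicked by a small toy model. The sketch below rests on assumptions that go beyond the text: the two thermometers are taken to be identical, so that thermal equilibration replaces both energies by their mean, and the only other operation is rubbing, which raises either energy. The function name is hypothetical. Under these assumptions the forward sector of a compound state is the union of two quadrants, and one finds pairs of states neither of which is accessible from the other. Since a real-valued function that characterizes the relation would make every pair of states comparable, no such entropy can exist in this toy world.

```python
from typing import Tuple

State = Tuple[float, float]  # energies (u1, u2) of the two thermometers


def in_forward_sector(x: State, y: State) -> bool:
    """True if y can be reached from x by rubbing and thermal equilibration.

    Rubbing alone reaches every state with both energies at least as large.
    Equilibrating first moves x to (mean, mean); rubbing afterwards reaches
    every state with both energies at least the mean. Rubbing before
    equilibrating only raises the mean, so it adds nothing new.
    """
    u1, u2 = x
    a, b = y
    mean = 0.5 * (u1 + u2)
    by_rubbing = a >= u1 and b >= u2
    by_equilibrating_then_rubbing = a >= mean and b >= mean
    return by_rubbing or by_equilibrating_then_rubbing


if __name__ == "__main__":
    x, y = (1.0, 3.0), (3.0, 1.0)
    # Neither state is accessible from the other: the sectors are not nested.
    print(in_forward_sector(x, y), in_forward_sector(y, x))
    # Both can reach the equilibrated state (2, 2).
    print(in_forward_sector(x, (2.0, 2.0)), in_forward_sector(y, (2.0, 2.0)))
```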

In our formulation the reversibility required for the definition of entropy is a consequence of the 
comparison principle and the stability axiom A3. (The latter allows us to treat reversible processes 
as limiting cases of irreversible processes, which are, strictly speaking, the only processes realized 
in nature.) This is seen most directly in Lemma 2.3, which characterizes the entropy of a state in 
terms of adiabatic equivalence of this state with another state in a compound system. This lemma 
depends crucially on CH (for the compound system) and A3. 

So one may ask what, in our formulation, corresponds to the negative statements in the classical 
versions of the second law. The answer is: It is axiom S1, which says that from every state of a 
simple system one can start an irreversible adiabatic process. In combination with A1-A6 and 
the convexity axiom A7, this is equivalent to Carathéodory's principle. Moreover, together with 
the other simple system axioms, in particular the assumption about the pressure, S2, it leads to 
Planck's principle, Theorem 3.4, which states the impossibility of extracting energy adiabatically 
from a simple system at fixed work coordinates. Hence, the entropy not only exists, but also is 
nowhere locally constant. This additional property of entropy is a precise version of the classical 
statements of the second law. By contrast, an entropy having level sets like the temperature in 
Fig. 8 would allow the construction of a perpetual motion machine of the second kind. 

It would be a mistake, however, to underestimate the role played by the axioms other than S1. 
They are all part of the structure of thermodynamics as presented here, and conspire to produce 
an entropy function that separates precisely the possible from the impossible and has the convexity 
and regularity properties required in the practical application of thermodynamics. 

LIST OF SYMBOLS 



A. Some Standard Mathematical Symbols 

$a \in A$ or $A \ni a$ means 'the point a is an element of the set A'. 

$a \notin A$ means 'the point a is not an element of the set A'. 

$A \subset B$ or $B \supset A$ means 'the set A is in the set B'. 

$A \cap B$ is the set of objects that are in the set A and in the set B. 

$A \cup B$ is the set of objects that are either in the set A or in the set B or in both sets. 

$A \times B$ is the set consisting of pairs (a, b) with $a \in A$ and $b \in B$. 

$\{a : P\}$ means the set of objects a having property P. 

$a := b$ or $b =: a$ means 'the quantity a is defined by b'. 

$P \Rightarrow Q$ means 'P implies Q'. 

$\mathbb{R}^n$ is n-dimensional Euclidean space whose points are n-tuples $(x_1, \ldots, x_n)$ of real numbers. 

$[s, t]$ means the closed interval $s \leq x \leq t$. 

$\partial A$ means the boundary of a set A. 



B. Special Symbols 

$X \prec Y$   ('X precedes Y') means that the state Y is adiabatically accessible from the state X. (Sect. II.A.2) 

$X \not\prec Y$   ('X does not precede Y') means that Y is not adiabatically accessible from X. (Sect. II.A.2) 

$X \prec\prec Y$   ('X strictly precedes Y') means that Y is adiabatically accessible from X, but X is not accessible from Y. (Sect. II.A.2) 

$X \stackrel{A}{\sim} Y$   ('X is adiabatically equivalent to Y') means that $X \prec Y$ and $Y \prec X$. (Sect. II.A.2) 

$X \stackrel{T}{\sim} Y$   means that the states X and Y are in thermal equilibrium. (Sect. IV.A) 

$A_X$   the 'forward sector' of a state $X \in \Gamma$, i.e., $\{Y \in \Gamma : X \prec Y\}$. (Sect. II.F) 

$tX$   a copy of the state X, but scaled by a factor t. (Sect. II.A.1) 

$\Gamma^{(t)}$   the state space consisting of scaled states tX, with $X \in \Gamma$. (Sect. II.A.1) 

$tX + (1 - t)Y$   a convex combination of states X and Y in a state space with a convex structure. (Sect. II.F) 

$\Sigma(X_0, X_1)$   the 'strip' $\{X \in \Gamma : X_0 \prec X \prec X_1\}$ between the adiabats through $X_0$ and $X_1 \in \Gamma$, $X_0 \prec X_1$. (Sect. II.D) 

$\rho_X$   the projection of $\partial A_X$ onto the space of work coordinates, for X in the state space of a simple system $\Gamma \subset \mathbb{R}^{n+1}$, i.e., $\rho_X = \{V \in \mathbb{R}^n : (U, V) \in \partial A_X \text{ for some } U \in \mathbb{R}\}$. (Sect. III.C) 

$\rho$   the projection onto the space of work coordinates of a simple system $\Gamma$, i.e., if $X = (U, V) \in \Gamma$, then $\rho(X) = V$. (Sect. IV.A) 

INDEX OF TECHNICAL TERMS 



Additivity of entropy (Sect. II.B) 

Adiabat (Sect. III.B) 

Adiabatic accessibility (Sect. II.A.2) 

Adiabatic equivalence (Sect. II.A.2) 

Adiabatic process (Sect. II.A.1) 

Boundary of a forward sector (Sect. III.B) 

Canonical entropy (Sect. II.D) 

Cancellation law (Sect. II.C) 

Carathéodory's principle (Sect. II.G) 

Carnot efficiency (Sect. V.A) 

Comparable states (Sect. II.A.2) 

Comparison hypothesis (CH) (Sect. II.C) 

Composition of systems (Sect. II.A.1) 

Consistent entropies (Sect. II.E) 

Convex state space (Sect. II.F) 

Degenerate simple system (=thermometer) (Sect. III.A) 

Entropy (Sect. II.B) 

Entropy calibrator (Sect. IV.A) 

Entropy constants (Sect. II.E) 

Entropy function on a state space (Sect. II.D) 

Entropy principle (EP) (Sect. II.B) 

Extensivity of entropy (Sect. II.B) 

First law of thermodynamics (Sect. III.A) 

Forward sector (Sect. II.F) 

Generalized ordering (Sect. II.D) 

Internal energy (Sect. III.A) 

Irreversible process (Sect. II.G) 

Isotherm (Sect. IV.A) 

Lipschitz continuity (Sect. III.B) 

Lower temperature (Sect. V.A) 

Multiple scaled copy (Sect. II.A.1) 

Planck's principle (Sect. III.C) 

Pressure (Sect. III.B) 

Reference points for entropy (Sect. II.D) 

Second law of thermodynamics (Sect. II.B) 

Scaled copy (Sect. II.A.1) 

Scaled product (Sect. II.A.1) 

Simple system (Sect. III) 

Stability (Sect. II.C) 

State (Sect. II.A.1) 

State space (Sect. II.A.1) 

Subsystem (Sect. II.A.1) 

System (Sect. II.A.1) 

Temperature (Sect. V.A) 

Thermal contact (Sect. IV.A) 

Thermal equilibration (Sect. IV.A) 

Thermal equilibrium (Sect. IV.A) 

Thermal join (Sect. IV.A) 

Thermal reservoir (Sect. V.A) 

Thermal splitting (Sect. III.C) 

Thermometer (=degenerate simple system) (Sect. III.A) 

Transversality (Sect. IV.A) 

Upper temperature (Sect. V.A) 

Work coordinate (Sect. III.A) 

Zeroth law of thermodynamics (Sect. IV.A) 

Figure 1. An example of a violent adiabatic process. The system in an equilibrium state X is 
transformed by mechanical means to another equilibrium state Y. 

Figure 2. The entropy of a state X is determined, according to formula (2.14), by the amount of 
substance in state $X_1$ that can be transformed to X with the aid of a complementary amount of 
substance in the state $X_0$. 

Figure 3. This illustrates axiom A7 and Theorem 2.6, which says that if states Y and Z can be 
reached adiabatically from a state X and if the state space has a convex structure then convex 
combinations of Y and Z are also in the forward sector of X. 

Figure 4. This illustrates the energy U and work coordinates V of a simple system. The state 
space (dashed line) is always a convex set and the forward sector $A_X$ of any point X is always a 
convex subset of the state space. The heavy dark curve denotes the boundary $\partial A_X$ of $A_X$ and 
consists of points that are adiabatically equivalent to X (as Theorem 3.7 states). The projection 
of this boundary on the work coordinates is $\rho_X$, which can be strictly smaller than the projection 
of $A_X$. 

Figure 5. The top figure illustrates how the forward sectors of a simple system are nested. The 
adiabats (i.e., the boundaries of the forward sectors) do not overlap. The 3 points are related by 
$X \prec\prec Y \prec\prec Z$. The lower figure shows what, in principle, could go wrong, but doesn't, according 
to Theorem 3.7. The top pair of adiabats have a point in common but neither $W \prec Z$ nor $Z \prec W$ 
holds. The bottom pair is a bit more subtle; $X \prec Y$ and Y is on the boundary of the forward 
sector of X, but X is not in the forward sector of Y. 

Figure 6. This illustrates the transversality axiom T4. For every state X there are points $X_0$ and 
$X_1$ on both sides of the adiabat through X that are in thermal equilibrium with each other. The 
points $Y_0$ and $Y_1$ (corresponding to some other point Y) need not be in thermal equilibrium with 
$X_0$ and $X_1$. 

Figure 7. This shows the state space of two 'thermometers', which means that there are only 
energy coordinates. The forward sectors of X and Y are shown under the assumption that the only 
allowed adiabatic operations are thermal equilibration (which moves X to X' and Y to Y') and 
rubbing (which increases, but never decreases the energy). We see clearly that these sectors are 
not nested (i.e., one does not lie inside the other), as they are for compounds of simple systems, 
satisfying the transversality axiom T4. 

Figure 8. This shows isotherms in the (U, V) plane near the triple point of a simple system. If one 
substituted pressure or temperature for U or V as a coordinate then the full two-dimensional region 
would be compressed into a one-dimensional region. In the triple point region the temperature is 
constant, which shows that isotherms need not be one-dimensional curves. 